fully utilises the crypto engines as well as the other on-chip resources. When the resources have been fully utilised, the system is saturated. If more workload is then input, additional arbitration and control overhead is incurred, which affects the data throughput, but the throughput remains relatively stable as the workload grows further.
Figure 6a shows that the performance of the block cipher engines, represented by DES, is sensitive to the number of engines, the bus width, and the number of DMAs. Considering the performance/cost tradeoff, 2 parallel DES engines with 2 W/R DMAs can generate nearly 2 Gbps of data throughput while consuming relatively little power and area. In Figure 6b, data throughput is less sensitive to the bus width and the number of DMAs than to the number of engines, because the public-key cipher engine has a much longer operating time, so the data transfer efficiency has little effect on the overall performance. Although the curves marked by triangles and reversed triangles provide much higher data throughput than the other curves, the hardware implementations under these configurations are costly in area and power. Hence, the configuration with a 64-bit data bus and 2 parallel public-key cipher engines is chosen, which is sufficient for the overall system performance. The SHA-1 engine provides lower throughput and occupies a smaller area than the DES engine; hence 4 SHA-1 engines are implemented to achieve Gbps throughput.
Following the same methodology, and according to the specific performance/area tradeoff and design targets, the optimal design parameters can be chosen from the performance evaluation results. These show that 4 parallel crypto engines are sufficient for the hash function, and 2 parallel crypto engines for each kind of block cipher and public-key cipher. The experiments also show that the configuration with 2 CDMAs and 4 WDMAs/RDMAs is sufficient for the internal data transfer requirements.
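The selection step described above (pick the cheapest configuration that still meets the throughput target) can be sketched in a few lines of Python. This is only an illustration of the idea, not the authors' tool: the `Config` class, the `pick_config` helper, and every number in `candidates` are hypothetical placeholders and are not read from Figure 6.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    engines: int            # number of parallel crypto engines
    bus_width_bits: int     # data bus width
    dmas: int               # number of W/R DMAs
    throughput_gbps: float  # hypothetical evaluated throughput
    cost: float             # hypothetical combined area/power cost


def pick_config(candidates, target_gbps):
    """Return the lowest-cost candidate that meets the target, or None."""
    feasible = [c for c in candidates if c.throughput_gbps >= target_gbps]
    return min(feasible, key=lambda c: c.cost) if feasible else None


# Placeholder values for illustration only; NOT taken from the evaluation.
candidates = [
    Config(1, 32, 1, 0.9, 1.0),
    Config(2, 64, 2, 1.9, 2.1),
    Config(4, 64, 4, 3.2, 4.5),
    Config(4, 128, 4, 3.6, 6.0),
]

best = pick_config(candidates, target_gbps=1.8)
print(best)
```

With these made-up numbers the larger configurations also meet the target, but they are rejected because they cost more, which mirrors the reasoning used to discard the high-throughput curves.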
The data import and export for each crypto engine are implemented with FIFOs to facilitate the data transfer process. The optimal positioning of the crypto engines makes parallel processing possible; hence, different independent tasks can be processed simultaneously.
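As a rough software analogy for FIFO-buffered engines working on independent tasks, the sketch below gives each task its own engine thread with an input and an output queue. It is an assumption-laden illustration, not a model of the actual hardware: the `engine` and `run_engines` functions and the `fifo_depth` parameter are hypothetical names, and SHA-1 from Python's `hashlib` merely stands in for a crypto engine.

```python
import hashlib
import queue
import threading


def engine(in_fifo, out_fifo, transform):
    """Drain the input FIFO, transform each block, push it to the output FIFO."""
    while True:
        block = in_fifo.get()
        if block is None:          # sentinel: no more data for this engine
            out_fifo.put(None)
            return
        out_fifo.put(transform(block))


def run_engines(tasks, transform, fifo_depth=4):
    """Run one engine thread per independent task, each with its own FIFO pair."""
    lanes = []
    for _ in tasks:
        in_fifo = queue.Queue(maxsize=fifo_depth)  # bounded import FIFO
        out_fifo = queue.Queue()                   # unbounded export FIFO
        worker = threading.Thread(target=engine, args=(in_fifo, out_fifo, transform))
        worker.start()
        lanes.append((in_fifo, out_fifo, worker))

    # Feed each lane; a full input FIFO blocks the producer until the engine catches up.
    for (in_fifo, _, _), blocks in zip(lanes, tasks):
        for block in blocks:
            in_fifo.put(block)
        in_fifo.put(None)

    # Collect the results of each lane in order.
    results = []
    for _, out_fifo, worker in lanes:
        lane_out = []
        while (item := out_fifo.get()) is not None:
            lane_out.append(item)
        worker.join()
        results.append(lane_out)
    return results


def sha1_hex(block):
    return hashlib.sha1(block).hexdigest()


tasks = [[b"block-0", b"block-1"], [b"other-task"]]
results = run_engines(tasks, sha1_hex)
```

The FIFOs decouple the producer from each engine, so a slow engine only stalls its own lane while the other lanes keep running.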