✅ Optimize the encryption and decryption logic and fix the issues in the test
This commit is contained in:
182
BENCHMARK.md
Normal file
182
BENCHMARK.md
Normal file
@@ -0,0 +1,182 @@
|
||||
# go-xcipher Performance Benchmark Guide
|
||||
|
||||
[中文版本](BENCHMARK_CN.md)
|
||||
|
||||
This document provides guidelines for running performance benchmarks of the go-xcipher library and interpreting the test results.
|
||||
|
||||
## Test Overview
|
||||
|
||||
These benchmarks aim to comprehensively compare the performance of the go-xcipher library with the encryption functions in the Go standard library. The tests include:
|
||||
|
||||
1. Basic encryption/decryption performance tests
|
||||
2. Stream encryption/decryption performance tests
|
||||
3. Multi-core scaling performance tests
|
||||
4. Hardware acceleration performance tests
|
||||
5. Memory usage efficiency tests
|
||||
6. Performance matrix tests for different algorithms and data sizes
|
||||
|
||||
## Running Tests
|
||||
|
||||
You can run the complete benchmark suite using:
|
||||
|
||||
```bash
|
||||
go test -bench=Benchmark -benchmem -benchtime=3s
|
||||
```
|
||||
|
||||
Or run specific tests:
|
||||
|
||||
```bash
|
||||
# Basic encryption performance test
|
||||
go test -bench=BenchmarkCompareEncrypt -benchmem
|
||||
|
||||
# Basic decryption performance test
|
||||
go test -bench=BenchmarkCompareDecrypt -benchmem
|
||||
|
||||
# Stream encryption performance test
|
||||
go test -bench=BenchmarkCompareStreamEncrypt -benchmem
|
||||
|
||||
# Stream decryption performance test
|
||||
go test -bench=BenchmarkCompareStreamDecrypt -benchmem
|
||||
|
||||
# Multi-core scaling performance test
|
||||
go test -bench=BenchmarkMultiCoreScaling -benchmem
|
||||
|
||||
# Hardware acceleration performance test
|
||||
go test -bench=BenchmarkHardwareAcceleration -benchmem
|
||||
|
||||
# Memory usage efficiency test
|
||||
go test -bench=BenchmarkMemoryUsage -benchmem
|
||||
|
||||
# Performance matrix test
|
||||
go test -bench=BenchmarkPerformanceMatrix -benchmem
|
||||
```
|
||||
|
||||
Get test guide and system information:
|
||||
|
||||
```bash
|
||||
go test -run=TestPrintBenchmarkGuide
|
||||
```
|
||||
|
||||
## Test Files Description
|
||||
|
||||
### 1. xcipher_bench_test.go
|
||||
|
||||
This file contains basic performance benchmarks, including:
|
||||
|
||||
- Encryption/decryption performance tests for different data sizes
|
||||
- Stream encryption/decryption performance tests
|
||||
- Parallel vs serial processing performance comparison
|
||||
- Performance tests with different buffer sizes
|
||||
- Impact of worker thread count on performance
|
||||
- File vs memory operation performance comparison
|
||||
- Zero-copy vs copy operation performance comparison
|
||||
- Adaptive parameter performance tests
|
||||
- CPU architecture optimization performance tests
|
||||
|
||||
### 2. stdlib_comparison_test.go
|
||||
|
||||
This file contains performance comparison tests with the standard library, including:
|
||||
|
||||
- Performance comparison with standard library ChaCha20Poly1305
|
||||
- Performance comparison with AES-GCM
|
||||
- Stream encryption/decryption performance comparison
|
||||
- Multi-core scaling tests
|
||||
- Hardware acceleration performance tests
|
||||
- Memory usage efficiency tests
|
||||
- Performance matrix tests
|
||||
|
||||
### 3. stability_test.go
|
||||
|
||||
This file contains stability tests, including:
|
||||
|
||||
- Long-running stability tests
|
||||
- Concurrent load tests
|
||||
- Fault tolerance tests
|
||||
- Resource constraint tests
|
||||
- Large data processing tests
|
||||
- Error handling tests
|
||||
|
||||
## Interpreting Test Results
|
||||
|
||||
Benchmark results typically have the following format:
|
||||
|
||||
```
|
||||
BenchmarkName-NumCPU iterations time/op B/op allocs/op
|
||||
```
|
||||
|
||||
Where:
|
||||
- `BenchmarkName`: Test name
|
||||
- `NumCPU`: Number of CPU cores used in the test
|
||||
- `iterations`: Number of iterations
|
||||
- `time/op`: Time per operation
|
||||
- `B/op`: Bytes allocated per operation
|
||||
- `allocs/op`: Number of memory allocations per operation
|
||||
|
||||
### Performance Evaluation Criteria
|
||||
|
||||
1. **Throughput (B/s)**: The `B/s` value in the test report indicates bytes processed per second, higher values indicate better performance.
|
||||
2. **Latency (ns/op)**: Average time per operation, lower values indicate better performance.
|
||||
3. **Memory Usage (B/op)**: Bytes allocated per operation, lower values indicate better memory efficiency.
|
||||
4. **Memory Allocations (allocs/op)**: Number of memory allocations per operation, lower values indicate less GC pressure.
|
||||
|
||||
### Key Performance Metrics Interpretation
|
||||
|
||||
1. **Small Data Performance**: For small data (1KB-4KB), low latency (low ns/op) is the key metric.
|
||||
2. **Large Data Performance**: For large data (1MB+), high throughput (high B/s) is the key metric.
|
||||
3. **Parallel Scalability**: The ratio of performance improvement as CPU cores increase reflects parallel scaling capability.
|
||||
|
||||
## Key Performance Comparisons
|
||||
|
||||
### XCipher vs Standard Library ChaCha20Poly1305
|
||||
|
||||
This comparison reflects the performance differences between XCipher's optimized ChaCha20Poly1305 implementation and the standard library implementation. XCipher should show advantages in:
|
||||
|
||||
1. Large data encryption/decryption throughput
|
||||
2. Multi-core parallel processing capability
|
||||
3. Memory usage efficiency
|
||||
4. Real-time stream processing capability
|
||||
|
||||
### XCipher vs AES-GCM
|
||||
|
||||
This comparison reflects performance differences between different encryption algorithms. On modern CPUs (especially those supporting AES-NI instruction set), AES-GCM may perform better in some cases, but ChaCha20Poly1305 shows more consistent performance across different hardware platforms.
|
||||
|
||||
## Influencing Factors
|
||||
|
||||
Test results may be affected by:
|
||||
|
||||
1. CPU architecture and instruction set support (AVX2, AVX, SSE4.1, NEON, AES-NI)
|
||||
2. Operating system scheduling and I/O state
|
||||
3. Go runtime version
|
||||
4. Other programs running simultaneously
|
||||
|
||||
## Special Test Descriptions
|
||||
|
||||
### Multi-core Scalability Test
|
||||
|
||||
This test demonstrates parallel processing capability by gradually increasing the number of CPU cores used. Ideally, performance should increase linearly with the number of cores.
|
||||
|
||||
### Stream Processing Tests
|
||||
|
||||
These tests simulate real-world stream data encryption/decryption scenarios by processing data in chunks. This is particularly important for handling large files or network traffic.
|
||||
|
||||
### Hardware Acceleration Test
|
||||
|
||||
This test shows performance comparisons of various algorithms on CPUs with specific hardware acceleration features (e.g., AVX2, AES-NI).
|
||||
|
||||
## Result Analysis Example
|
||||
|
||||
Here's a simplified result analysis example:
|
||||
|
||||
```
|
||||
BenchmarkCompareEncrypt/XCipher_Medium_64KB-8 10000 120000 ns/op 545.33 MB/s 65536 B/op 1 allocs/op
|
||||
BenchmarkCompareEncrypt/StdChaCha20Poly1305_Medium_64KB-8 8000 150000 ns/op 436.27 MB/s 131072 B/op 2 allocs/op
|
||||
```
|
||||
|
||||
Analysis:
|
||||
- XCipher processes 64KB data about 25% faster than the standard library (120000 ns/op vs 150000 ns/op)
|
||||
- XCipher's memory allocation is half that of the standard library (65536 B/op vs 131072 B/op)
|
||||
- XCipher has fewer memory allocations than the standard library (1 allocs/op vs 2 allocs/op)
|
||||
|
||||
## Continuous Optimization
|
||||
|
||||
Benchmarks are an important tool for continuously optimizing library performance. By regularly running these tests, you can detect performance regressions and guide further optimization work.
|
Reference in New Issue
Block a user