@GOODBOY008
Member

@GOODBOY008 GOODBOY008 commented Sep 14, 2025

Overview

This PR introduces a comprehensive benchmark performance testing module for FastExcel, implementing the proposal outlined in #572.

Benchmark Results Available

CI Benchmark Run Completed Successfully: https://github.com/GOODBOY008/fastexcel/actions/runs/17709908635

Benchmark Artifacts: The workflow generated comprehensive benchmark reports available in the artifacts:

  • HTML Reports: Interactive benchmark comparison reports
  • Raw Results: JMH benchmark data in JSON format
  • Analysis: Performance analysis and comparison metrics

To view the benchmark reports:

  1. Download the benchmark-results artifact from: https://github.com/GOODBOY008/fastexcel/actions/runs/17709908635
  2. Unzip the downloaded file
  3. Open benchmark-reports/benchmark-comparison.html in your browser

What's Changed

New Module: fastexcel-benchmark

• JMH Integration: Complete Maven configuration with industry-standard Java microbenchmarking framework
• Comprehensive Test Suites:
◦ Comparison benchmarks (FastExcel vs Apache POI)
◦ Memory efficiency specialized tests
◦ Streaming operation performance tests
◦ Microbenchmarks for core components
• Automated Execution: Multi-profile support with configurable dataset sizes and memory settings
• Advanced Features:
◦ Interactive CLI with scenario management
◦ Real-time memory profiling with GC tracking
◦ HTML visualization reports and JSON data export
◦ Performance trend analysis and regression detection
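
As a rough illustration of what memory profiling with GC tracking can build on, the standard `java.lang.management` API exposes heap usage and per-collector counters. The sketch below is not the PR's `MemoryProfiler`; `HeapSnapshot` and its fields are hypothetical names, and taking one snapshot before and one after a workload is an assumed usage pattern.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

// Hypothetical helper for illustration only; not part of the PR.
final class HeapSnapshot {
    final long usedHeapBytes;
    final long gcCount;
    final long gcTimeMillis;

    private HeapSnapshot(long usedHeapBytes, long gcCount, long gcTimeMillis) {
        this.usedHeapBytes = usedHeapBytes;
        this.gcCount = gcCount;
        this.gcTimeMillis = gcTimeMillis;
    }

    // Reads current heap usage and sums the counters of all registered collectors.
    static HeapSnapshot capture() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        long used = memory.getHeapMemoryUsage().getUsed();
        long count = 0;
        long time = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Both getters return -1 when the collector does not report the value.
            count += Math.max(0, gc.getCollectionCount());
            time += Math.max(0, gc.getCollectionTime());
        }
        return new HeapSnapshot(used, count, time);
    }
}
```

Subtracting the counters of a snapshot taken before a benchmark iteration from one taken after it gives a coarse view of GC activity for that iteration.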

Key Components

  1. Core Framework (cn.idev.excel.benchmark.core)
    ◦ Abstract benchmark base classes
    ◦ Configuration management
    ◦ Memory profiler integration

  2. Test Scenarios (cn.idev.excel.benchmark.*)
    ◦ Read/Write operation benchmarks
    ◦ Fill operation performance tests
    ◦ Streaming benchmarks for large datasets
    ◦ Memory efficiency analysis

  3. Comparison Benchmarks (cn.idev.excel.benchmark.comparison)
    ◦ Direct FastExcel vs Apache POI performance comparison
    ◦ Multi-dimensional analysis (throughput, latency, memory)

  4. Utilities (cn.idev.excel.benchmark.utils)
    ◦ Test data generation
    ◦ File management utilities
    ◦ Reporting and visualization

  5. Automated Scripts (scripts/benchmark-runner.sh)
    ◦ Profile-based execution (quick/standard/comprehensive)
    ◦ Configurable parameters and output formats
    ◦ Regression analysis automation

GitHub Actions Integration

Workflow (.github/workflows/benchmark.yml)
◦ Manual trigger with workflow_dispatch for on-demand benchmarking
◦ Java 11 setup with proper classpath resolution
◦ Automated artifact upload for benchmark results
◦ Fixed JMH forking issues for reliable results

Test Scenarios Coverage

• Data Scales: SMALL(1K) → MEDIUM(10K) → LARGE(100K) → EXTRA_LARGE(1M+)
• File Formats: XLSX
• Operation Types: Read, Write, Fill, Streaming
• Memory Analysis: Real-time monitoring, GC pressure analysis, allocation patterns
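
The data scales listed above could be modeled as a small enum so that every benchmark reads its row count from one place. This is only a sketch: `DatasetSize` and `getRowCount` are hypothetical names rather than the PR's `BenchmarkConfiguration` API, and treating "1M+" as exactly 1,000,000 rows is an assumption.

```java
// Hypothetical enum for illustration; names and the EXTRA_LARGE value are assumptions.
enum DatasetSize {
    SMALL(1_000),
    MEDIUM(10_000),
    LARGE(100_000),
    EXTRA_LARGE(1_000_000); // listed as "1M+"; the exact figure is assumed here

    private final int rowCount;

    DatasetSize(int rowCount) {
        this.rowCount = rowCount;
    }

    int getRowCount() {
        return rowCount;
    }
}
```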

Benefits

  1. Validates Performance Claims: Provides empirical evidence for FastExcel's performance advantages
  2. Quality Assurance: Enables systematic performance analysis and regression detection
  3. User Confidence: Transparent performance reports for informed decision-making
  4. Development Guidance: Data-driven optimization insights

Closes #572

@GOODBOY008 GOODBOY008 changed the title from "feat: Add comprehensive benchmark comparison workflow for FastExcel vs Apache POI" to "feat: Introduce FastExcel Benchmark Performance Testing Module" Sep 14, 2025
@GOODBOY008
Member Author

GOODBOY008 commented Sep 14, 2025

@delei @alaahong

CI Benchmark Run Completed Successfully: https://github.com/GOODBOY008/fastexcel/actions/runs/17709908635

To view the benchmark reports:

  1. Download the benchmark-results artifact from: https://github.com/GOODBOY008/fastexcel/actions/runs/17709908635/artifacts/4006114037
  2. Unzip the downloaded file
  3. Open benchmark-reports/benchmark-comparison.html in your browser

There are a few issues to address:

  1. In the Performance Comparisons section of the HTML report, the content is incomplete. A dataset and a format column need to be added.
  2. For the 1M dataset scenario, the POI run failed, so no benchmark results were generated.

@psxjoy
Member

psxjoy commented Sep 14, 2025

I'm really excited about this PR. However, it's quite large, so the code review will take some time.

Also, no offense intended, but I'd like to ask: Did you use AI-generated code in this PR?

@GOODBOY008
Member Author

> I'm really excited about this PR. However, it's quite large, so the code review will take some time.
>
> Also, no offense intended, but I'd like to ask: Did you use AI-generated code in this PR?

@psxjoy Yes, some parts (such as the comparison report, memory profiler logic, and quickstart scripts) were AI-assisted. AI is quite effective in these scenarios, and I’ve verified those parts to make sure they work correctly.

I noticed the artifact wasn’t accessible, so I’ve uploaded the results for your review.
benchmark-results.zip

@psxjoy psxjoy added the PR: developing, discussion welcome, and enhancement labels Sep 14, 2025
@delei
Member

delei commented Sep 18, 2025

Hi, @GOODBOY008
Thank you for submitting the PR.

Regarding this PR, I still have some questions:

  • It seems that the file ./fastexcel-benchmark/scripts/benchmark-runner.sh does not exist?
  • Introducing JMH benchmark testing is highly necessary, but currently we don't need to run it through CI.
  • If possible, I suggest deleting the code for generating reports and analyzing results, and only keeping the JMH classes.

Please refer to the above suggestions and make appropriate modifications to the PR content. After that, we will vote on this PR together with other reviewers ASAP.

@GOODBOY008
Member Author

Hi @delei
Thanks for your feedback.

For the first point, I understand the concern about the PR size — my intention was to split the work into stages, so this submission might look a bit large.

Regarding the second and third points:
• Running benchmarks in CI helps produce relatively stable and reproducible results. Running them locally is often influenced by background tasks and can take a long time.
• As for report generation and analysis, they make it easier to compare multiple runs, especially when evaluating different scenarios. Doing this entirely by hand would be quite time-consuming.

I’m fine with keeping only the JMH core classes for now, but I’d like to highlight the above considerations.

@GOODBOY008 GOODBOY008 force-pushed the feat/benchmark-comparison-workflow branch from 5a95c2d to 1be72cb Compare September 23, 2025 07:50
@GOODBOY008
Member Author

@delei PTAL

@GOODBOY008 GOODBOY008 force-pushed the feat/benchmark-comparison-workflow branch from 1be72cb to 1824158 Compare September 26, 2025 08:49
@delei delei added the PR: require-multiple-approvals label and removed the enhancement, PR: developing, and discussion welcome labels Oct 1, 2025
@GOODBOY008 GOODBOY008 force-pushed the feat/benchmark-comparison-workflow branch from 1824158 to a828caf Compare November 24, 2025 07:23
@netlify

netlify bot commented Nov 24, 2025

Deploy Preview for fesod ready!

Name Link
🔨 Latest commit 3a2c467
🔍 Latest deploy log https://app.netlify.com/projects/fesod/deploys/69240bb276a415000840ef16
😎 Deploy Preview https://deploy-preview-575--fesod.netlify.app

@GOODBOY008 GOODBOY008 force-pushed the feat/benchmark-comparison-workflow branch from a828caf to 3a2c467 Compare November 24, 2025 07:39
@GOODBOY008 GOODBOY008 force-pushed the feat/benchmark-comparison-workflow branch from 3a2c467 to 4ede42b Compare January 16, 2026 05:46
Copilot AI review requested due to automatic review settings January 16, 2026 05:46
Contributor

Copilot AI left a comment

Pull request overview

This PR introduces a comprehensive JMH-based benchmark module for the FastExcel library, enabling performance testing and comparisons with Apache POI across various operations (read, write, fill) and dataset sizes. The module includes memory profiling capabilities, test data generation utilities, and comparison benchmarks to validate FastExcel's performance claims.

Changes:

  • New fesod-benchmark module with complete JMH integration and Maven configuration
  • Benchmark suites for read, write, and fill operations across multiple dataset sizes and file formats
  • Memory profiling utilities with GC tracking and detailed statistics
  • Comparison benchmarks between FastExcel and Apache POI
  • Comprehensive test data generation with configurable characteristics

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 20 comments.

| File | Description |
| --- | --- |
| pom.xml | Added fesod-benchmark module to parent POM |
| fesod-benchmark/pom.xml | New Maven configuration with JMH dependencies and shade plugin |
| fesod-benchmark/benchmark.md | Documentation for running and interpreting benchmarks |
| MemoryProfiler.java | Utility for real-time memory profiling with GC tracking |
| DataGenerator.java | Test data generation with multiple data types and characteristics |
| BenchmarkFileUtil.java | File management utilities for benchmark operations |
| BenchmarkData.java | Data model with 20 fields covering various Excel data types |
| BenchmarkConfiguration.java | Configuration enums for dataset sizes and file formats |
| AbstractBenchmark.java | Base class providing common benchmark functionality |
| WriteBenchmark.java | Write operation benchmarks for different sizes and scenarios |
| ReadBenchmark.java | Read operation benchmarks with multiple listener patterns |
| FillBenchmark.java | Template fill operation benchmarks |
| FastExcelVsPoiBenchmark.java | Comparison benchmarks between FastExcel and Apache POI |
| ComparisonBenchmarkRunner.java | Runner for executing comparison benchmarks |

// Test files for different sizes and formats
private String xlsxSmallFile;
private String xlsxMediumFile;
private String xlsEXTRA_LARGEFile;

Copilot AI Jan 16, 2026

Variable name xlsEXTRA_LARGEFile uses inconsistent capitalization mixing underscores with camelCase. Should be xlsxLargeFile to match the naming pattern of other variables like xlsxSmallFile and xlsxMediumFile.

Suggested change
- private String xlsEXTRA_LARGEFile;
+ private String xlsxLargeFile;

Copilot uses AI. Check for mistakes.
}

@Benchmark
public void readXlsEXTRA_LARGE(Blackhole blackhole) throws Exception {

Copilot AI Jan 16, 2026

Method name readXlsEXTRA_LARGE uses inconsistent capitalization with underscores and uppercase. Should be readXlsxLarge to follow Java naming conventions and match the pattern of other methods like readXlsxSmall and readXlsxMedium.

Suggested change
- public void readXlsEXTRA_LARGE(Blackhole blackhole) throws Exception {
+ public void readXlsxLarge(Blackhole blackhole) throws Exception {

// Stream reading benchmarks
@Benchmark
public void readXlsEXTRA_LARGEWithStreaming(Blackhole blackhole) throws Exception {

Copilot AI Jan 16, 2026

Method name readXlsEXTRA_LARGEWithStreaming uses inconsistent capitalization. Should be readXlsxLargeWithStreaming to follow Java naming conventions.

Suggested change
- public void readXlsEXTRA_LARGEWithStreaming(Blackhole blackhole) throws Exception {
+ public void readXlsxLargeWithStreaming(Blackhole blackhole) throws Exception {

// Different listener types benchmarks
@Benchmark
public void readXlsEXTRA_LARGECountingOnly(Blackhole blackhole) throws Exception {

Copilot AI Jan 16, 2026

Method name readXlsEXTRA_LARGECountingOnly uses inconsistent capitalization. Should be readXlsxLargeCountingOnly to follow Java naming conventions.

Suggested change
- public void readXlsEXTRA_LARGECountingOnly(Blackhole blackhole) throws Exception {
+ public void readXlsxLargeCountingOnly(Blackhole blackhole) throws Exception {

}

@Benchmark
public void readXlsEXTRA_LARGECollecting(Blackhole blackhole) throws Exception {

Copilot AI Jan 16, 2026

Method name readXlsEXTRA_LARGECollecting uses inconsistent capitalization. Should be readXlsxLargeCollecting to follow Java naming conventions.

Suggested change
- public void readXlsEXTRA_LARGECollecting(Blackhole blackhole) throws Exception {
+ public void readXlsxLargeCollecting(Blackhole blackhole) throws Exception {

"Generated {} rows in {} ms ({} rows/sec)",
rowCount,
duration,
duration > 0 ? (rowCount * 1000 / duration) : "N/A");

Copilot AI Jan 16, 2026

Potential overflow in int multiplication before it is converted to long by use in a numeric context.

Suggested change
- duration > 0 ? (rowCount * 1000 / duration) : "N/A");
+ duration > 0 ? (rowCount * 1000L / duration) : "N/A");
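
To make the overflow concrete: with an `int` row count, `rowCount * 1000` is evaluated in 32-bit arithmetic and wraps once `rowCount` exceeds 2,147,483 (`Integer.MAX_VALUE / 1000`), before the division by a `long` duration widens the result. The sketch below assumes `rowCount` is an `int` and `duration` is a `long`, as the review comment implies; `ThroughputDemo` and its method names are hypothetical.

```java
// Hypothetical demo class; mirrors the expression under review, not the PR's code.
final class ThroughputDemo {
    // Multiplies in int arithmetic first, so large row counts wrap around.
    static long rowsPerSecondInt(int rowCount, long durationMs) {
        return rowCount * 1000 / durationMs;
    }

    // The 1000L literal promotes the multiplication to long, avoiding the wrap.
    static long rowsPerSecondLong(int rowCount, long durationMs) {
        return rowCount * 1000L / durationMs;
    }
}
```

For 3,000,000 rows generated in 1,000 ms, the `int` variant yields a negative throughput while the `long` variant yields 3,000,000 rows/sec. A dataset of exactly 1,000,000 rows stays below the threshold, so the bug would only surface on larger runs.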

protected void setupBenchmark() throws Exception {
// Custom setup logic if needed
}

Copilot AI Jan 16, 2026

This method overrides AbstractBenchmark.tearDownBenchmark; it is advisable to add an Override annotation.

Suggested change
+ @Override

Comment on lines +161 to +165

protected void setupBenchmark() throws Exception {
// Custom setup logic if needed
}

Copilot AI Jan 16, 2026

This method overrides AbstractBenchmark.setupBenchmark; it is advisable to add an Override annotation.

Suggested change
- protected void setupBenchmark() throws Exception {
- // Custom setup logic if needed
- }
+ @Override
+ protected void setupBenchmark() throws Exception {
+ // Custom setup logic if needed
+ }
+ @Override
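
The practical value of the annotation is that the compiler verifies the method really overrides something. The sketch below shows the failure mode it guards against; `BaseBenchmark`, `GoodBenchmark`, and `TypoBenchmark` are hypothetical classes for illustration and are not taken from the PR.

```java
// Hypothetical classes for illustration only.
abstract class BaseBenchmark {
    // Default hook does nothing; subclasses may override it.
    protected void setupBenchmark() { }

    final void run() {
        setupBenchmark();
    }
}

class GoodBenchmark extends BaseBenchmark {
    boolean ready;

    @Override
    protected void setupBenchmark() {
        ready = true;
    }
}

class TypoBenchmark extends BaseBenchmark {
    boolean ready;

    // Misspelled name: this declares a new method instead of overriding the hook,
    // so run() never calls it. Adding @Override here would be a compile-time error.
    protected void setupBenchmarks() {
        ready = true;
    }
}
```

After calling `run()`, `GoodBenchmark.ready` is `true` while `TypoBenchmark.ready` stays `false`, and nothing warns about it unless the annotation is present.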

System.out.printf("Setup comparison benchmark: %s format, %d rows%n", fileFormat, rowCount);
}

Copilot AI Jan 16, 2026

This method overrides AbstractBenchmark.tearDownTrial; it is advisable to add an Override annotation.

Suggested change
+ @Override

private List<BenchmarkData> testDataList;
private MemoryProfiler memoryProfiler;
private List<ComparisonResult> localResults = new ArrayList<>();

Copilot AI Jan 16, 2026

This method overrides AbstractBenchmark.setupTrial; it is advisable to add an Override annotation.

Suggested change
+ @Override

Labels

PR: require-multiple-approvals This pull request requires multiple approvals.

Development

Successfully merging this pull request may close these issues.

Proposal: Introducing FastExcel Benchmark Performance Testing Module

3 participants