
PHP 100 Million Row Benchmark: How Fast Is PHP Really?

Updated April 2026: PHP still gets dismissed as “slow” because of memories from shared hosting and early WordPress-era code. But the interesting question is not whether PHP can beat C, Rust, or Go in a microbenchmark. The practical question is whether modern PHP can process very large datasets predictably without exhausting memory, and what coding patterns make the difference.

To make that concrete, I revisited the idea behind the “100 million row challenge” with a reproducible PHP CLI benchmark. The result is simple but useful: on a modest 6 vCPU VPS, PHP 8.3 processed 100 million synthetic rows in under two seconds while keeping peak memory around 2 MB. That does not mean every PHP data job will be that fast. It does mean that PHP’s reputation is often worse than its real bottlenecks.

PHP 100 Million Row Benchmark: The Test Environment

The benchmark below was run from the command line, not through a web request. That distinction matters. Long-running batch jobs should use PHP CLI, queues, workers, or cron, not an HTTP request waiting behind a reverse proxy timeout.
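
If a job does end up routed through the web stack by accident, it inherits every proxy and PHP-FPM timeout. A minimal guard at the top of a batch script makes the CLI-only rule explicit; the exact message below is just an illustration:

<?php
// Illustrative guard for the top of a batch script: refuse to run under a
// web SAPI so the job can only be launched from the CLI, cron, or a worker.
if (PHP_SAPI !== 'cli') {
    http_response_code(403);
    exit("This script must be run from the command line.\n");
}

// ... long-running batch work goes here ...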

Component        Value
PHP version      PHP 8.3.30 CLI, non-thread-safe build
Opcache (CLI)    Enabled for the run
CPU              6 vCPU AMD EPYC-Milan
Memory           7.7 GiB RAM available on the VPS
Workload         Streaming synthetic row generation, checksum, min/max scan
Storage          No disk I/O in the timed loop

This is intentionally a CPU-bound streaming test. It answers one narrow question: can PHP loop through 100 million simple records and aggregate values without building a massive array in memory? For database-heavy or file-heavy workloads, MySQL query planning, indexes, disk I/O, network latency, and serialization formats will dominate the result.

The Reproducible PHP Benchmark Code

The important design choice is that the script does not store 100 million rows. It generates one value, processes it, and moves on. That is the same pattern you want when reading a CSV, scanning logs, processing a stream, or consuming a queue.

<?php
$sizes = [1_000_000, 10_000_000, 100_000_000];

foreach ($sizes as $n) {
    $start = microtime(true);
    $sum = 0;
    $min = PHP_INT_MAX;
    $max = PHP_INT_MIN;

    for ($i = 1; $i <= $n; $i++) {
        // Deterministic pseudo-row value. This avoids disk I/O so the test
        // measures PHP loop/aggregation overhead rather than storage speed.
        $v = ($i * 1103515245 + 12345) & 0x7fffffff;

        $sum += $v % 1000;
        if ($v < $min) $min = $v;
        if ($v > $max) $max = $v;
    }

    $elapsed = microtime(true) - $start;

    printf(
        "rows=%d seconds=%.3f rows_per_sec=%.0f peak_mb=%.2f checksum=%d min=%d max=%d\n",
        $n,
        $elapsed,
        $n / $elapsed,
        memory_get_peak_usage(true) / 1048576,
        $sum,
        $min,
        $max
    );
}

Run it with:

php -d opcache.enable_cli=1 benchmark.php

Benchmark Results: 100 Million Rows in PHP 8.3

Here are the measured results from the VPS run:

Rows processed    Time       Rows/sec          Peak memory    Checksum
1,000,000         0.025 s    40.4 million/s    2.00 MB        499,528,032
10,000,000        0.196 s    51.0 million/s    2.00 MB        4,995,025,504
100,000,000       1.975 s    50.6 million/s    2.00 MB        49,950,018,536

The headline number is less important than the shape of the result. Memory stayed flat because the code streamed values instead of accumulating rows. Throughput also scaled predictably from 10 million to 100 million rows. That is the lesson: modern PHP can handle large batch loops if the code avoids the classic memory traps.
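
To make the contrast concrete, here is a hypothetical counter-example, not part of the benchmark above: the same aggregation written with the classic memory trap of materialising every row before processing it. Run it separately and compare its peak-memory line with the roughly 2 MB the streaming version reports.

<?php
// Memory-trap version of the benchmark (do not do this at 100M rows).
// Run separately, e.g.: php -d memory_limit=1G trap.php
$n = 10_000_000; // deliberately 10x smaller than the benchmark so it still fits in RAM

// Every value is kept in the array before any aggregation happens.
$rows = [];
for ($i = 1; $i <= $n; $i++) {
    $rows[] = ($i * 1103515245 + 12345) & 0x7fffffff;
}

$sum = 0;
foreach ($rows as $v) {
    $sum += $v % 1000;
}

printf("rows=%d checksum=%d peak_mb=%.2f\n", $n, $sum, memory_get_peak_usage(true) / 1048576);
// Expect hundreds of MB of peak memory even at 10 million rows; at 100 million
// the array alone needs multiple GB and may exceed memory_limit entirely.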

Why This Matters for Developers and Engineers

Most PHP performance problems are not caused by the language being incapable of raw iteration. They come from architectural choices that multiply work:

  • Loading everything into arrays: a 100 million row array is not a benchmark; it is a memory incident.
  • N+1 database queries: the database round trip often costs more than the PHP calculation.
  • Unindexed filtering: asking PHP to compensate for missing database indexes is expensive.
  • Work inside web requests: batch jobs should run in CLI workers, queues, or scheduled tasks.
  • String-heavy parsing: CSV/JSON parsing can dominate runtime even when numeric aggregation is fast.

For WordPress and Laravel teams, this is a practical reminder. PHP is perfectly capable of serious background processing, but the code must be written like a data pipeline rather than a page template.
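
As a rough illustration of what that looks like in a Laravel codebase, a chunked query inside an artisan command or queued job keeps memory flat and pushes filtering into SQL. The table and column names below are hypothetical placeholders:

<?php
// Rough Laravel-flavoured sketch (orders table, status and amount columns
// are assumptions). chunkById() pages through the table by primary key, so
// memory stays flat and the database does the filtering instead of PHP.
use Illuminate\Support\Facades\DB;

$total = 0;

DB::table('orders')
    ->where('status', 'paid')               // filter in SQL, not in PHP
    ->chunkById(5000, function ($orders) use (&$total) {
        foreach ($orders as $order) {
            $total += $order->amount;        // aggregate as you go, keep nothing
        }
    });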

How to Make PHP Data Jobs Fast

If you need PHP to process millions of rows, the biggest wins usually come from boring engineering discipline:

  • Stream input: use generators, SplFileObject, database cursors, pagination, or queue batches.
  • Keep memory flat: aggregate as you go; do not collect every row unless the business logic truly requires it.
  • Move filtering to the database: use indexes and selective queries before PHP sees the data.
  • Batch writes: insert/update in chunks instead of one write per row (see the PDO sketch after this list).
  • Measure with realistic data: synthetic loops are useful, but production data formats and I/O change the bottleneck.
  • Use CLI workers: avoid browser/proxy/PHP-FPM request limits for long jobs.
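
As a sketch of the batching point from the list above, here is one way to buffer rows and flush them as multi-row inserts with PDO. The DSN, credentials, table name, and chunk size are placeholders, and loadRows() stands in for whatever streaming source feeds the job:

<?php
// Minimal chunked-insert sketch with PDO. All connection details, the
// `events` table, and the chunk size are hypothetical placeholders.
$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'secret', [
    PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
]);

// Stand-in for any streaming source (CSV reader, cursor, queue batch, ...).
function loadRows(): \Generator
{
    for ($i = 1; $i <= 10_000; $i++) {
        yield ['user_id' => $i, 'amount' => ($i % 100) / 10];
    }
}

$chunkSize = 1000;
$buffer = [];

$flush = function (array $chunk) use ($pdo): void {
    if ($chunk === []) {
        return;
    }
    // One multi-row INSERT per chunk instead of one round trip per row.
    $placeholders = implode(',', array_fill(0, count($chunk), '(?, ?)'));
    $stmt = $pdo->prepare("INSERT INTO events (user_id, amount) VALUES $placeholders");
    $stmt->execute(array_merge(...$chunk));
};

foreach (loadRows() as $row) {
    $buffer[] = [$row['user_id'], $row['amount']];
    if (count($buffer) >= $chunkSize) {
        $flush($buffer);
        $buffer = [];
    }
}
$flush($buffer); // write the final partial chunk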

PHP’s generators are especially useful for streaming. So are SPL tools such as SplFileObject. For database workloads, the best “PHP optimization” may be a better SQL query plan.
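
Here is a minimal streaming sketch along those lines, assuming a hypothetical sales.csv whose second column holds an amount. SplFileObject reads one line at a time and a generator hands each row to the aggregation loop, so peak memory stays flat regardless of file size:

<?php
// Streaming CSV aggregation sketch. The file name and column layout are
// assumptions; swap in your own path and parsing logic.
function csvRows(string $path): \Generator
{
    $file = new SplFileObject($path, 'r');
    $file->setFlags(SplFileObject::READ_CSV | SplFileObject::SKIP_EMPTY);

    foreach ($file as $row) {
        if ($row === false || $row === [null]) {
            continue; // skip blank or trailing lines
        }
        yield $row;
    }
}

$total = 0;
$count = 0;
foreach (csvRows('sales.csv') as $row) {
    $total += (float) ($row[1] ?? 0); // column 1 assumed to hold the amount
    $count++;
}

printf("rows=%d total=%.2f peak_mb=%.2f\n", $count, $total, memory_get_peak_usage(true) / 1048576);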

What This Benchmark Does Not Prove

This benchmark does not prove that PHP is the fastest language for data processing. It also does not measure JSON parsing, CSV reads, network calls, database joins, ORM hydration, image processing, or WordPress hooks. Those are common real-world bottlenecks, and each deserves its own benchmark.

It does prove something narrower and still valuable: PHP 8.3 can run a 100 million iteration streaming aggregation with predictable CPU time and tiny memory usage. If a PHP batch job collapses at scale, the first suspect should be data flow and architecture, not the language logo.

Key Takeaways

  • PHP 8.3 processed 100 million synthetic rows in 1.975 seconds in this CPU-bound CLI benchmark.
  • Peak memory stayed around 2 MB because the script streamed values instead of storing rows.
  • Large PHP jobs should run through CLI workers, queues, or cron — not web requests.
  • The biggest PHP performance wins usually come from streaming, indexing, batching, and avoiding N+1 queries.
  • For real production decisions, benchmark the actual workload: file format, database access, network calls, and framework overhead matter.

PHP is not magic, and it is not automatically fast. But with modern PHP, opcache, and streaming-oriented code, the language is far more capable than its old reputation suggests.
