# Performance

elek is designed for high-performance CSV processing. It uses code generation, compiled validators, and bitmap-based error tracking to achieve excellent throughput even with large datasets.
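For orientation, here is a minimal sketch of the hot path these numbers describe, using the APIs covered below (assumes `csv` holds the raw file contents and `schema` is your column schema):

```ts
import { parse, validateBitmap } from "@elekcsv/core";

// Parse with a code-generated parser, then validate with fixed-size bitmaps.
const { rows } = parse(csv, { header: true });
const result = validateBitmap(rows, schema);

// Errors are materialized lazily, a page at a time.
const firstPage = result.getErrors({ limit: 50, offset: 0 });
```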

## Benchmark Results

Tested on 100,000 rows x 8 columns:

| Operation | Time | Throughput | Notes |
| --- | --- | --- | --- |
| Parse | ~37ms | 2.7M rows/sec | Code-generated parser |
| Validate (compiled) | ~30ms | 3.3M rows/sec | No locale, 5 rules/column |
| Validate (with locale) | ~72ms | 1.4M rows/sec | Turkish date/number parsing |
| Full pipeline | ~76ms | 1.3M rows/sec | Parse + map + validate |
| Column mapping | ~2ms | 50M cols/sec | 8 columns, fuzzy matching |
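Exact numbers depend on hardware. A rough harness for reproducing figures of this kind might look like the sketch below; the synthetic data and the all-string `schema` are illustrative stand-ins, not the benchmark suite itself:

```ts
import { parse, validateBitmap } from "@elekcsv/core";

// Synthetic input: 100,000 rows x 8 columns.
const header = "c0,c1,c2,c3,c4,c5,c6,c7";
const row = "a,b,c,d,e,f,g,h";
const csv = [header, ...Array.from({ length: 100_000 }, () => row)].join("\n");

// A hypothetical 8-column schema of plain strings (the cheapest type).
const schema = {
  columns: Object.fromEntries(
    Array.from({ length: 8 }, (_, i) => [`c${i}`, { type: "string" }]),
  ),
};

const t0 = performance.now();
const { rows } = parse(csv, { header: true });
const t1 = performance.now();
validateBitmap(rows, schema);
const t2 = performance.now();

console.log(`parse: ${(t1 - t0).toFixed(1)}ms, validate: ${(t2 - t1).toFixed(1)}ms`);
```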

## Memory Usage

### Standard Validation

With the standard `validate()`, memory grows with the error count:

| Scenario | Memory |
| --- | --- |
| No errors | ~1KB (result object) |
| 1% error rate | ~8MB (100K * 0.01 * 8 cols * 100 bytes/error) |
| 10% error rate | ~80MB |

### Bitmap Validation

With `validateBitmap()`, memory is fixed regardless of error count:

| Dataset Size | Bitmap | Codes | Total |
| --- | --- | --- | --- |
| 10K x 8 | 10KB | 80KB | ~90KB |
| 100K x 8 | 100KB | 800KB | ~900KB |
| 1M x 8 | 1MB | 8MB | ~9MB |

Formula: `rows * cols * 1.125` bytes (one bit per cell for the bitmap plus one byte per cell for error codes).
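The same formula as a throwaway helper (not part of the library), handy for budgeting before choosing a validation mode:

```ts
// 1 bit/cell for the bitmap + 1 byte/cell for error codes = 1.125 bytes/cell.
function bitmapMemoryBytes(rows: number, cols: number): number {
  return rows * cols * 1.125;
}

bitmapMemoryBytes(100_000, 8); // 900,000 bytes, i.e. ~900KB as in the table
```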

## Compiled Validators

Schema compilation generates optimized JavaScript functions:

```js
// Before compilation (conceptual)
function validateCell(value, rules) {
  for (const rule of rules) {  // Loop overhead
    if (rule.type === "required" && value === "") return error;
    if (rule.type === "email" && !emailRegex.test(value)) return error;
    // ...
  }
}

// After compilation (generated)
function validateName(v) {
  if (v === '') return 1;  // Inlined, no loop
  return 0;
}

function validateEmail(v) {
  if (v === '') return 1;
  if (!/^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(v)) return 10;
  return 0;
}
```

## Performance Tips

### 1. Use Bitmap Validation for Large Files

```ts
import { validate, validateBitmap } from "@elekcsv/core";

// Switch to bitmap for 10K+ rows
const result = data.length > 10000
  ? validateBitmap(data, schema)
  : validate(data, schema);
```

### 2. Reuse Compiled Validators

```ts
import { CompiledValidator } from "@elekcsv/core";

// Compile once
const validator = new CompiledValidator(schema);

// Reuse for multiple files
const result1 = validator.validateAllBitmap(data1);
const result2 = validator.validateAllBitmap(data2);
```

### 3. Skip Locale When Not Needed

```ts
// Slower: locale parsing on every cell
const schema1 = {
  locale: "tr",
  columns: {
    name: { type: "string" },  // Still uses Turkish comparison
  },
};

// Faster: no locale
const schema2 = {
  columns: {
    name: { type: "string" },  // No locale overhead
  },
};
```

### 4. Use Specific Column Types

// "string" is fastest - no validation
{ type: "string" }

// "integer" is fast - simple regex
{ type: "integer" }

// "date" with locale is slower - format parsing
{ type: "date", locale: "tr" }

### 5. Limit Error Pagination

```ts
const result = validateBitmap(data, schema);

// Don't: Materialize all errors at once
const allErrors = result.getErrors({ limit: Infinity });

// Do: Paginate as needed
const page1 = result.getErrors({ limit: 50, offset: 0 });
const page2 = result.getErrors({ limit: 50, offset: 50 });
```

### 6. Use Preview for Large Files

```ts
// Validate a preview first
const preview = data.slice(0, 100);
const previewResult = validate(preview, schema);

if (previewResult.errors.length > 0) {
  // Show preview errors before processing full file
  showErrors(previewResult.errors);
  return;
}

// Process full file
const fullResult = validateBitmap(data, schema);
```

## Parser Performance

The parser generates a specialized function for each CSV structure and caches it by column count:

```ts
// First parse: compiles parser (~1ms)
const result1 = parse(csv1);

// Subsequent parses with same column count: cached (~0.1ms)
const result2 = parse(csv2);  // Reuses compiled parser
```

### Parser Cache

```ts
import { clearParserCache } from "@elekcsv/core";

// Clear when parsing many different CSV structures
// to prevent memory accumulation
clearParserCache();
```

## Profiling Your Pipeline

```ts
import { parse, mapColumns, applyMapping, validateBitmap } from "@elekcsv/core";

function profilePipeline(csv: string, schema: any) {  // schema: your column schema
  const t0 = performance.now();

  const { headers, rows } = parse(csv, { header: true });
  const t1 = performance.now();
  console.log(`Parse: ${(t1 - t0).toFixed(2)}ms`);

  const mapping = mapColumns(headers!, schema);
  const t2 = performance.now();
  console.log(`Map: ${(t2 - t1).toFixed(2)}ms`);

  const mapped = applyMapping([headers!, ...rows], mapping.mappings, schema);
  const t3 = performance.now();
  console.log(`Apply: ${(t3 - t2).toFixed(2)}ms`);

  const result = validateBitmap(mapped.slice(1), schema);
  const t4 = performance.now();
  console.log(`Validate: ${(t4 - t3).toFixed(2)}ms`);

  console.log(`Total: ${(t4 - t0).toFixed(2)}ms`);
  console.log(`Rows: ${rows.length}`);
  console.log(`Throughput: ${(rows.length / (t4 - t0) * 1000).toFixed(0)} rows/sec`);

  return result;
}
```
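Call it with the raw CSV string and your schema, and read the per-stage timings off the console:

```ts
const result = profilePipeline(csvContent, schema);
```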

## Comparison with Alternatives

| Library | Parse 100K | Validate | Column Mapping | Locale |
| --- | --- | --- | --- | --- |
| elek | 37ms | 30ms | Built-in | Built-in |
| PapaParse | 45ms | N/A | N/A | N/A |
| uDSV | 25ms | N/A | N/A | N/A |
| csv-parse | 120ms | N/A | N/A | N/A |

Of the libraries compared, elek is the only one that provides validation, column mapping, and locale support out of the box.

## Web Workers

For very large files, you can offload parsing and validation to a Web Worker to keep the main thread responsive:

```ts
import { createWorkerClient } from "@elekcsv/core";

const worker = createWorkerClient({
  workerUrl: "/dist/worker.js"
});

// Process in background - UI stays responsive
const result = await worker.parseAndValidate(csvContent, schema, {
  maxRows: 100000
});

worker.terminate();
```

### Worker Setup

Add the worker to your build or serve it statically:

```js
// Vite: public/worker.js
importScripts("/dist/worker.js");
```

Or import it directly in browsers that support module workers:

```ts
const worker = new Worker(
  new URL("@elekcsv/core/worker.js", import.meta.url),
  { type: "module" }
);

const client = createWorkerClient({ worker });
```

### When to Use Workers

| File Size | Recommended |
| --- | --- |
| < 10K rows | Main thread (default) |
| 10K - 100K rows | Consider worker |
| > 100K rows | Use worker |