A production-grade Kotlin-based parser for NEM12 format energy meter reading files. The parser reads NEM12 files, validates their contents, and stores meter readings and parsing failures in a database.
Key Features:
- Best-Effort Parsing: Continues processing even when individual records fail validation
- Dual Failure Handling: Saves failures to database AND logs them to console in real-time
- Batch Processing: Optimized batch inserts for high performance
- Timezone Conversion: Automatically converts AEST/AEDT timestamps to UTC
- Test Coverage: Includes automated tests that verify parser stability
What it does:
- Reads NEM12 format files line-by-line
- Validates each record against NEM12 specifications
- Converts meter readings to UTC timezone
- Stores valid readings in the meter_reading table
- Stores failed records in the failure_reading table with detailed error information
- Logs all failures to console for real-time monitoring
Runtime Dependencies:
- JDK 21: Java Development Kit
- Kotlin: Modern JVM language with null-safety and type inference
- SQLite: Embedded database for data storage
Development Dependencies:
- Gradle: Build automation tool
- JUnit: Testing framework
- Ktlint: Kotlin code style checker and formatter
Prerequisites:
- JDK 21 or higher installed
- No additional software required (SQLite is embedded)
# Clone the repository
cd nem12-parser
# Build the project (runs tests automatically)
./gradlew clean build
Basic Usage:
java -jar build/libs/nem12-parser-1.0.0-standalone.jar <input-file> <output-database>
Examples:
# Parse a NEM12 file and store results in output.db
java -jar build/libs/nem12-parser-1.0.0-standalone.jar ./src/test/resources/sample.nem12 output.db
# Custom batch size (default: 50)
java -jar build/libs/nem12-parser-1.0.0-standalone.jar ./src/test/resources/sample.nem12 output.db --batch-size=500
# Run with default test file
./gradlew run
# Run with custom arguments
./gradlew run --args="input.csv output.db"
Query the database:
# Open SQLite database
sqlite3 output.db
# View successful meter readings
SELECT * FROM meter_reading LIMIT 10;
# View failed records with reasons
SELECT line_number, failure_reason, nmi, raw_value
FROM failure_reading
ORDER BY line_number;
# Run all tests
./gradlew test
# Auto-format code
./gradlew ktlintFormat
After running the parser, you'll find:
- <output-database>.db: SQLite database with two tables:
  - meter_reading: Successfully parsed meter readings
  - failure_reading: Failed records with error details
INFO - Starting to parse file: sample.nem12
WARN - Parsing failure - Line 15: NEGATIVE_VALUE (NMI: 1234567890, Interval: 5, Time: 2024-01-01T12:00, Raw: '-10.5')
INFO - Successfully parsed 1523 lines
INFO - Parsing completed successfully
Database created: output.db
Failed records:
NEGATIVE_VALUE: 2
EMPTY_VALUE: 5
INTERVAL_COUNT_MISMATCH: 1
Failed records database: output.db
The NEM12 Parser follows a Layered Architecture pattern with clear separation of concerns across three main layers:
┌─────────────────────────────────────────────────────────┐
│                       Main (CLI)                        │
│  - Command-line argument parsing                        │
│  - Dependency injection setup                           │
└────────────────────┬────────────────────────────────────┘
                     │
         ┌───────────┴───────────┐
         ▼                       ▼
┌──────────────────┐    ┌──────────────────┐
│ Failure Handler  │    │  Parser Service  │
│   (Composite)    │◄───│  (NEM12Parser)   │
├──────────────────┤    ├──────────────────┤
│ - Database       │    │ - File reading   │
│ - Logging        │    │ - State machine  │
│ - (Extensible)   │    │ - Validation     │
└──────────────────┘    └────────┬─────────┘
                                 │
                    ┌────────────┴────────────┐
                    ▼                         ▼
           ┌──────────────────┐      ┌──────────────────┐
           │  Record Parser   │      │    Repository    │
           │     Service      │      │  (Data Access)   │
           ├──────────────────┤      ├──────────────────┤
           │ - Interval data  │      │ - Meter reading  │
           │ - Validation     │      │ - Failure record │
           │ - Failure notify │      │ - Batch insert   │
           └──────────────────┘      └──────────────────┘
                                              │
                                              ▼
                                     ┌──────────────────┐
                                     │ SQLite Database  │
                                     │ - meter_reading  │
                                     │ - failure_reading│
                                     └──────────────────┘
Main.kt - Application entry point
- Parses command-line arguments
- Creates and wires dependencies
- Orchestrates the parsing workflow
- Displays statistics and results
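As an illustration of the argument handling described above, here is a minimal sketch of parsing `<input-file> <output-database> [--batch-size=N]` with the documented default of 50. `CliArgs` and `parseArgs` are hypothetical names chosen for this example; the actual Main.kt may be organised differently.

```kotlin
// Hypothetical holder for parsed arguments; the project's real type may differ.
data class CliArgs(
    val inputFile: String,
    val outputDatabase: String,
    val batchSize: Int = 50, // default batch size stated in the usage examples
)

// Hypothetical helper: separates "--" flags from positional arguments and validates both.
fun parseArgs(args: Array<String>): CliArgs {
    val batchPrefix = "--batch-size="
    val (flags, positional) = args.partition { it.startsWith("--") }

    require(positional.size == 2) {
        "Usage: <input-file> <output-database> [--batch-size=N]"
    }
    val unknown = flags.filterNot { it.startsWith(batchPrefix) }
    require(unknown.isEmpty()) { "Unknown option(s): $unknown" }

    // If the flag is repeated, the last occurrence wins; a bad number is an error, not a silent default.
    val batchSize = flags.lastOrNull()?.let { flag ->
        val n = flag.removePrefix(batchPrefix).toIntOrNull()
        require(n != null && n > 0) { "Invalid batch size: $flag" }
        n
    } ?: 50

    return CliArgs(positional[0], positional[1], batchSize)
}
```

Rejecting an unparsable `--batch-size` value, rather than falling back to the default, is a choice made for this sketch so that typos surface immediately.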
NEM12ParserService - Main parsing orchestration
- Reads NEM12 file line-by-line
- Maintains parser state
- Delegates interval data parsing to RecordParserService
- Saves valid readings to repository
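The parser state mentioned above can be pictured as a small state machine over the NEM12 record types (100 header, 200 NMI details, 300 interval data, 900 end of file). The sketch below is an assumption about how such a state could look, not the project's ParserState; `nextState` and `outOfOrder` are hypothetical helpers, and reading the NMI from field 1 and the interval length from field 8 of a 200 record follows the published NEM12 layout.

```kotlin
// Hypothetical state type; the project's ParserState may be shaped differently.
sealed interface ParserState {
    object AwaitingHeader : ParserState                 // nothing read yet
    object InFile : ParserState                         // 100 seen, no 200 yet
    data class InNmiBlock(val nmi: String, val intervalMinutes: Int) : ParserState
    object Finished : ParserState                       // 900 seen
}

fun outOfOrder(type: String, state: ParserState): Nothing =
    throw IllegalStateException("Record $type is not allowed in state $state")

// Returns the state after consuming one line, or throws if the record is out of order.
fun nextState(state: ParserState, line: String): ParserState {
    val fields = line.trim().split(",")
    return when (val type = fields[0]) {
        "100" -> if (state == ParserState.AwaitingHeader) ParserState.InFile else outOfOrder(type, state)
        "200" -> when (state) {
            ParserState.InFile, is ParserState.InNmiBlock -> {
                val nmi = fields.getOrNull(1)
                    ?: throw IllegalArgumentException("200 record has no NMI")
                val minutes = fields.getOrNull(8)?.toIntOrNull()
                    ?: throw IllegalArgumentException("200 record has no interval length")
                ParserState.InNmiBlock(nmi, minutes)
            }
            else -> outOfOrder(type, state)
        }
        // Data records are only meaningful inside an NMI block and do not change the state.
        "300", "400", "500" -> if (state is ParserState.InNmiBlock) state else outOfOrder(type, state)
        "900" -> when (state) {
            ParserState.AwaitingHeader, ParserState.Finished -> outOfOrder(type, state)
            else -> ParserState.Finished
        }
        else -> throw IllegalArgumentException("Unknown record type: $type")
    }
}
```

A best-effort parser would likely report out-of-order records to the failure handler instead of throwing; exceptions are used here only to keep the transitions easy to read.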
RecordParserService - Interval data parsing and validation
- Parses 300 (interval data) records
- Checks interval count against expected count
- Notifies FailureHandler for invalid records
- Returns list of valid MeterReading objects
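To make the steps above concrete, here is a hedged sketch of parsing one 300 record. It assumes the layout `300,<yyyyMMdd>,<value 1>..<value N>,<quality>,...` with N = 1440 / interval length, stamps each reading with the start of its interval, and uses 1-based interval indexes; all three are assumptions, as are the `Failure` type, the callback signature, and the `NON_NUMERIC_VALUE` reason name. A malformed date is not handled and would throw.

```kotlin
import java.math.BigDecimal
import java.time.LocalDate
import java.time.LocalDateTime
import java.time.format.DateTimeFormatter

// Hypothetical shapes for illustration; the project's classes may carry more fields.
data class MeterReading(val nmi: String, val timestamp: LocalDateTime, val consumption: BigDecimal)
data class Failure(
    val lineNumber: Int,
    val reason: String,
    val nmi: String,
    val intervalIndex: Int?,
    val rawValue: String,
)

fun parseIntervalRecord(
    lineNumber: Int,
    line: String,
    nmi: String,
    intervalMinutes: Int,
    onFailure: (Failure) -> Unit,
): List<MeterReading> {
    val fields = line.split(",")
    val expected = 1440 / intervalMinutes // intervals per day
    // Record type + date + one value per interval must all be present.
    if (fields.size < 2 + expected) {
        onFailure(Failure(lineNumber, "INTERVAL_COUNT_MISMATCH", nmi, null, line))
        return emptyList()
    }
    val date = LocalDate.parse(fields[1], DateTimeFormatter.BASIC_ISO_DATE)

    val readings = mutableListOf<MeterReading>()
    for (i in 0 until expected) {
        val raw = fields[2 + i].trim()
        val value = raw.toBigDecimalOrNull()
        if (value == null || value.signum() < 0) {
            val reason = when {
                raw.isEmpty() -> "EMPTY_VALUE"
                value == null -> "NON_NUMERIC_VALUE" // hypothetical reason name
                else -> "NEGATIVE_VALUE"
            }
            onFailure(Failure(lineNumber, reason, nmi, i + 1, raw))
            continue // best effort: skip this interval, keep the rest
        }
        val timestamp = date.atStartOfDay().plusMinutes(i.toLong() * intervalMinutes)
        readings += MeterReading(nmi, timestamp, value)
    }
    return readings
}
```

Passing the failure sink as a callback keeps the parsing function independent of how failures are stored or logged.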
FailureHandler (Interface) - Defines failure handling contract
Implementations:
- DatabaseFailureHandler - Persists failures to SQLite
- LoggingFailureHandler - Logs failures to console
- CompositeFailureHandler - Combines multiple handlers (Composite Pattern)
Benefits:
- Easy to add new handlers (e.g., EmailHandler, MetricsHandler)
- Separation of concerns
- Testability through interfaces
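The handler contract and its composition might look roughly like the following. This is a sketch under assumptions: `FailureRecord` is a simplified stand-in, `CountingFailureHandler` replaces the database-backed handler so the example runs without SQLite, and the real interface may declare different members.

```kotlin
// Simplified, hypothetical failure payload.
data class FailureRecord(val lineNumber: Int, val reason: String, val rawValue: String)

interface FailureHandler : AutoCloseable {
    fun handleFailure(record: FailureRecord)
    override fun close() {} // most handlers have nothing to release
}

// Writes one line per failure; the sink is injectable so tests can capture output.
class LoggingFailureHandler(
    private val log: (String) -> Unit = { println(it) },
) : FailureHandler {
    override fun handleFailure(record: FailureRecord) =
        log("WARN - Parsing failure - Line ${record.lineNumber}: ${record.reason} (Raw: '${record.rawValue}')")
}

// Stand-in for a persistent handler: only tallies failures by reason.
class CountingFailureHandler : FailureHandler {
    val countsByReason = mutableMapOf<String, Int>()
    override fun handleFailure(record: FailureRecord) {
        countsByReason[record.reason] = (countsByReason[record.reason] ?: 0) + 1
    }
}

// Composite: forwards every call to each child handler, in order.
class CompositeFailureHandler(vararg handlers: FailureHandler) : FailureHandler {
    private val handlers = handlers.toList()
    override fun handleFailure(record: FailureRecord) = handlers.forEach { it.handleFailure(record) }
    override fun close() = handlers.forEach { it.close() }
}
```

Because the composite is itself a `FailureHandler`, callers depend on a single interface and never need to know how many handlers sit behind it.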
BaseSQLiteRepository
- Handles batch processing
- Provides timezone conversion utility (AEST → UTC)
- Implements common operations
Concrete Implementations:
- MeterReadingRepositoryImpl
  - Stores valid meter readings
  - Generates UUID for each record
  - Tracks total inserted count
- FailureReadingsRepositoryImpl
  - Stores failed parsing records
  - Tracks statistics by failure reason
  - Supports nullable fields (timestamp, interval index)
| Pattern | Usage | Benefit |
|---|---|---|
| Layered Architecture | Handler-Service-Repository | Separation of concerns, testability |
| Template Method | BaseSQLiteRepository | Code reuse, consistent behavior |
| Composite | CompositeFailureHandler | Combine multiple handlers flexibly |
| Strategy | FailureHandler implementations | Swap handling strategies at runtime |
| State Machine | ParserState | Track NEM12 file structure hierarchy |
| Dependency Injection | Constructor injection | Loose coupling, testability |
meter_reading table:
CREATE TABLE meter_reading (
    id          TEXT PRIMARY KEY,       -- UUID
    nmi         VARCHAR(10) NOT NULL,   -- Meter identifier
    timestamp   TIMESTAMP NOT NULL,     -- UTC timestamp
    consumption NUMERIC NOT NULL,       -- Energy consumption (15.4 format)
    UNIQUE(nmi, timestamp)              -- Prevent duplicates
);
failure_reading table:
CREATE TABLE failure_reading (
    id             TEXT PRIMARY KEY,    -- UUID
    line_number    INTEGER NOT NULL,    -- Source line in input file
    nmi            TEXT,                -- Meter identifier
    interval_index INTEGER,             -- Interval position
    raw_value      TEXT NOT NULL,       -- Original invalid value
    failure_reason TEXT NOT NULL,       -- Reason enum
    timestamp      TIMESTAMP            -- Timestamp
);
Decision: Streaming (line-by-line) approach
// Using BufferedReader with lineSequence()
cmd.inputPath.bufferedReader().use { reader ->
    reader.lineSequence().forEach { line ->
        parseLine(line.trim())
    }
}
Why:
- No need to load entire file into memory
Decision: Continue parsing even when individual records fail
failureHandler.use {
    for (i in 0 until expectedIntervals) {
        if (!isValid(value)) {
            failureHandler.handleFailure(record) // Log and continue
            continue
        }
        readings.add(validReading)
    }
}
Why:
- Maximize data extraction from partially corrupted files
- Better user experience (get some data vs. nothing)
- Detailed failure tracking for debugging
Alternative considered:
- Fail-fast approach (stop on first error)
- Rejected because: Real-world files often have isolated errors
Decision: Buffer records and insert in batches
fun save(entity: T) {
    insertStatement.addBatch()
    batchCount++
    if (batchCount >= batchSize) {
        executeBatch() // Execute when batch is full
    }
}
Why: Performance
- Reduces database I/O operations
- Efficient use of database connection
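The buffering idea can be shown without a database. The sketch below swaps the JDBC `addBatch()`/`executeBatch()` calls for a plain callback so it runs on its own; `BatchWriter` and its `flush` parameter are hypothetical names, not the project's BaseSQLiteRepository. It also shows a detail the excerpt above leaves implicit: a final, partially filled batch has to be flushed when the writer is closed.

```kotlin
// Collects entities and hands them to `flush` in chunks of `batchSize`.
class BatchWriter<T>(
    private val batchSize: Int = 50,
    private val flush: (List<T>) -> Unit, // stand-in for executing a JDBC batch
) : AutoCloseable {
    init {
        require(batchSize > 0) { "batchSize must be positive" }
    }

    private val buffer = ArrayList<T>(batchSize)

    var totalWritten = 0
        private set

    fun save(entity: T) {
        buffer.add(entity)
        if (buffer.size >= batchSize) executeBatch() // flush when the batch is full
    }

    fun executeBatch() {
        if (buffer.isEmpty()) return
        flush(buffer.toList()) // pass a copy so the callback cannot see later mutations
        totalWritten += buffer.size
        buffer.clear()
    }

    // Flush whatever is left so the last partial batch is not lost.
    override fun close() = executeBatch()
}
```

With a batch size of 3 and seven saved items, the callback receives two full batches during the run and one single-item batch on `close()`.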
Decision: Convert all timestamps to UTC before storage
fun aestToUtc(timestamp: LocalDateTime): LocalDateTime {
    return timestamp.atZone(AEST)
        .withZoneSameInstant(UTC)
        .toLocalDateTime()
}
Why:
- DST handling: Automatically handles AEST ↔ AEDT transitions
- International compatibility: UTC is standard for data storage
Requirement clarification from Shishir: input timestamps are in AEST (UTC+10:00) and can be stored in the database as UTC.
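How the `AEST` zone constant is defined decides whether daylight saving is applied, and the snippet above does not show its definition. The sketch below compares the two plausible choices with `java.time`: a fixed UTC+10:00 offset and the `Australia/Sydney` region zone. The names `FIXED_AEST`, `SYDNEY`, and `toUtc` are illustrative only.

```kotlin
import java.time.LocalDateTime
import java.time.ZoneId
import java.time.ZoneOffset

// Fixed offset: always UTC+10:00, never shifts for daylight saving.
val FIXED_AEST: ZoneId = ZoneOffset.ofHours(10)

// Region zone: follows Sydney's rules, so summer timestamps are treated as AEDT (UTC+11:00).
val SYDNEY: ZoneId = ZoneId.of("Australia/Sydney")

// Interprets a wall-clock timestamp in `zone` and returns the same instant as a UTC wall-clock time.
fun toUtc(local: LocalDateTime, zone: ZoneId): LocalDateTime =
    local.atZone(zone).withZoneSameInstant(ZoneOffset.UTC).toLocalDateTime()
```

For a winter timestamp such as 2024-07-01T12:00 both zones give 02:00 UTC. For a summer timestamp such as 2024-01-01T12:00 the fixed offset still gives 02:00 UTC, while `Australia/Sydney` gives 01:00 UTC, so the two definitions disagree by one hour for half the year.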
Decision: Multiple failure handlers combined via CompositeFailureHandler
val databaseHandler = DatabaseFailureHandler(repository)
val loggingHandler = LoggingFailureHandler()
val compositeHandler = CompositeFailureHandler(databaseHandler, loggingHandler)
Why:
- Flexibility: Enable/disable handlers independently
- Extensibility: Easy to add new handlers (email, metrics, etc.)
- Single Responsibility: Each handler does one thing
Tool: Claude
Used AI for architectural decision validation before implementation.
Example:
- Validated Repository and Composite patterns
Tool: Claude Code
AI assisted with code generation and Kotlin idioms:
- Generated BaseSQLiteRepository structure
- Implemented timezone conversion logic
- Created test scaffolding
Tool: Claude Bot + GitHub Actions
Set up AI-powered code review on pull requests. (Sample)
Impact: Instant feedback and automated validation of code quality
Tool: Google NotebookLM
Analyzed NEM12 specification documents to extract requirements.
Process:
- Uploaded NEM12 spec PDFs to NotebookLM
- Asked questions about record types and validation rules
- Generated summary of key requirements
This project was managed using GitHub Issues and Pull Requests to track tasks, document decisions, and maintain code quality through systematic review processes.
GitHub Issues - Used for:
- Feature planning and requirements tracking
- Bug tracking and resolution
- Design decision documentation
Pull Requests - Used for:
- Code review and quality assurance
- Feature integration
- Issue-driven development: Each feature/fix linked to a specific issue
- Branch strategy: Feature branches merged via PR
- Code review: All changes reviewed before merging
- Automated testing: CI/CD pipeline validates every PR
- Clear commit messages: Descriptive commits referencing issues
- Closed Issues: View all completed tasks
- Closed Pull Requests: View all merged changes