Contributing Guide

Thank you for considering contributing to async2databricks! This document provides guidelines for contributing to the project.

Development Setup

Prerequisites

- Java 11 or later
- SBT 1.9.7
- Docker and Docker Compose
- Git

Getting Started

Fork the repository

Clone your fork:

```
git clone https://github.com/YOUR_USERNAME/async2databricks.git
cd async2databricks
```

Set up the development environment:
```
docker compose up -d
sbt compile
```

Development Workflow

1. Create a Feature Branch

```
git checkout -b feature/your-feature-name
```

2. Make Your Changes

Follow the project structure:

```
src/
├── main/
│   ├── scala/com/async2databricks/
│   │   ├── config/      # Configuration models
│   │   ├── database/    # Database access layer
│   │   ├── etl/         # ETL pipeline logic
│   │   ├── model/       # Domain models
│   │   └── s3/          # S3 writer
│   └── resources/
│       └── application.conf  # Configuration
└── test/
    └── scala/com/async2databricks/  # Unit tests
```

3. Write Tests

All new functionality should include tests:

```scala
package com.async2databricks.yourpackage

import org.scalatest.flatspec.AnyFlatSpec
import org.scalatest.matchers.should.Matchers

class YourSpec extends AnyFlatSpec with Matchers {
  "YourClass" should "do something" in {
    // test implementation
  }
}
```

Run tests:

```
sbt test
```

4. Follow Code Style

The project uses standard Scala conventions:

- Use 2 spaces for indentation
- Line length: 120 characters
- Use meaningful variable names
- Add scaladoc comments for public APIs

Format your code:

```
sbt scalafmt
```

5. Commit Your Changes

Write clear commit messages:

```
git add .
git commit -m "feat: add support for incremental loads

- Add watermark tracking
- Implement checkpoint mechanism
- Update tests"
```

Follow conventional commits:

- feat: for new features
- fix: for bug fixes
- docs: for documentation
- test: for test changes
- refactor: for refactoring
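To check a commit header locally before pushing, something like the following sketch could be used. `CommitMessageCheck` and `isValidHeader` are hypothetical names, not part of the project. The optional `(scope)` and `!` markers come from the Conventional Commits format rather than from this guide, and the check accepts only the five prefixes listed above.

```scala
object CommitMessageCheck {
  // The five commit types listed in this guide; extend the set if the project accepts others.
  private val allowedTypes = Set("feat", "fix", "docs", "test", "refactor")

  // type, optional "(scope)", optional "!", then ": " and a non-empty description.
  private val Header = """([a-z]+)(\([^)]+\))?!?: (.+)""".r

  // Checks only the first line of a commit message.
  def isValidHeader(line: String): Boolean = line match {
    case Header(tpe, _, _) => allowedTypes.contains(tpe)
    case _                 => false
  }
}
```

For example, `CommitMessageCheck.isValidHeader("feat: add support for incremental loads")` is `true`, while a header without a recognized prefix, such as `"added stuff"`, is rejected.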

6. Push and Create PR

```
git push origin feature/your-feature-name
```

Then create a Pull Request on GitHub.

Code Guidelines

Functional Programming

This project uses functional programming with Cats Effect:

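```scala
import cats.effect.Async
import cats.syntax.all._ // provides flatMap/map syntax for F[_]

// Good: Pure functional code
// createConnection and fetchData are assumed to return their results in F
def loadData[F[_]: Async](config: Config): F[List[Data]] = {
  for {
    conn <- createConnection(config)
    data <- fetchData(conn)
  } yield data
}

// Avoid: Imperative code with side effects
def loadData(config: Config): List[Data] = {
  val conn = createConnection(config)  // side effect
  fetchData(conn)
}
```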
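The difference can be seen in a small self-contained sketch, assuming Cats Effect 3 and using `IO` as the concrete effect. `SuspendedEffectExample` and its members are hypothetical and exist only to make the side effect observable: building and composing `IO` values runs nothing until the program is executed at the edge.

```scala
import cats.effect.IO
import cats.effect.unsafe.implicits.global

object SuspendedEffectExample {
  // Mutable counter used only to make the side effect observable.
  private var calls = 0

  // Building this value does not touch `calls`; it only describes the increment.
  val increment: IO[Int] = IO { calls += 1; calls }

  // Composing two descriptions still runs nothing.
  val twice: IO[(Int, Int)] =
    for {
      first  <- increment
      second <- increment
    } yield (first, second)

  def callsSoFar: Int = calls

  // Effects run only here, at the edge of the program.
  def runTwice(): (Int, Int) = twice.unsafeRunSync()
}
```

In application code, prefer `IOApp` over calling `unsafeRunSync()` directly; it is used here only to keep the sketch short.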

Error Handling

Use Either, Option, or effect types for error handling:

```scala
// Good
def parse(input: String): Either[ParseError, Result] = ???

// Avoid
def parse(input: String): Result = {
  if (invalid) throw new Exception("Invalid")
  else result
}
```
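As a minimal sketch of the `Either` style, the example below parses a pool size from a string. `ParseExample`, the `ParseError` cases, and `parsePoolSize` are hypothetical names chosen for illustration, and `toIntOption` assumes Scala 2.13 or later.

```scala
object ParseExample {
  // Hypothetical error type for illustration; not taken from the project.
  sealed trait ParseError
  final case class NotANumber(input: String) extends ParseError
  final case class OutOfRange(value: Int) extends ParseError

  // Returns Left on bad input instead of throwing, so callers must handle the failure.
  def parsePoolSize(input: String): Either[ParseError, Int] =
    input.trim.toIntOption match {
      case None             => Left(NotANumber(input))
      case Some(n) if n > 0 => Right(n)
      case Some(n)          => Left(OutOfRange(n))
    }
}
```

Because the error cases form a sealed hierarchy, the compiler can warn when a caller's pattern match misses one of them.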

Type Safety

Leverage Scala's type system:

```scala
// Good: Type-safe configuration
case class DatabaseConfig(
  url: String,
  user: String,
  password: String,
  poolSize: Int
)

// Avoid: Stringly-typed configuration
def getConfig(key: String): String = ???
```
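One optional way to go further is to give each field its own wrapper type, so that two `String` values cannot be swapped by accident. This is a sketch only: the wrapper types and `TypedDatabaseConfig` are hypothetical, the connection values are made up, and the password field is omitted for brevity.

```scala
object TypedConfigExample {
  // Hypothetical wrapper types; the DatabaseConfig shown above uses plain String and Int.
  final case class JdbcUrl(value: String) extends AnyVal
  final case class Username(value: String) extends AnyVal
  final case class PoolSize(value: Int) extends AnyVal

  final case class TypedDatabaseConfig(url: JdbcUrl, user: Username, poolSize: PoolSize)

  // Passing a Username where a JdbcUrl is expected is a compile error, not a runtime surprise.
  val example: TypedDatabaseConfig =
    TypedDatabaseConfig(
      url = JdbcUrl("jdbc:postgresql://localhost:5432/example"),
      user = Username("etl_user"),
      poolSize = PoolSize(10)
    )
}
```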

Resource Management

Always use Resource for managing resources:

```scala
// Good
def createConnection[F[_]: Async]: Resource[F, Connection] = {
  Resource.make(acquire)(release)
}

// Avoid
def createConnection[F[_]: Async]: F[Connection] = {
  acquire // no cleanup
}
```
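The sketch below shows the acquire/release pairing end to end, assuming Cats Effect 3 with `IO` as the effect type. `FakeConnection` is a hypothetical stand-in for a real database connection, used only to observe when cleanup happens.

```scala
import cats.effect.{IO, Resource}
import cats.effect.unsafe.implicits.global

object ResourceExample {
  // Hypothetical stand-in for a real database connection.
  final class FakeConnection {
    @volatile var open: Boolean = true
    def close(): Unit = { open = false }
  }

  // acquire creates the connection; release runs when `use` finishes,
  // whether it completes, fails, or is canceled.
  val connection: Resource[IO, FakeConnection] =
    Resource.make(IO(new FakeConnection))(conn => IO(conn.close()))

  // Returns (open while in use, open after use).
  def run(): (Boolean, Boolean) = {
    var captured: Option[FakeConnection] = None
    val openInside = connection.use { conn =>
      IO { captured = Some(conn); conn.open }
    }.unsafeRunSync()
    (openInside, captured.exists(_.open))
  }
}
```

Here `ResourceExample.run()` returns `(true, false)`: the connection is open inside `use` and closed once `use` returns.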

Testing Guidelines

Unit Tests

Test individual components in isolation:

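```scala
"DataRepository" should "stream data correctly" in {
  val repo = DataRepository(transactor)
  val result = repo.streamData("SELECT * FROM test", 100)
    .compile
    .toList
    .unsafeRunSync() // run the effect; assumes F is cats.effect.IO with an implicit IORuntime in scope

  result should have size 10
}
```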

Integration Tests

Test interactions between components. Use Docker for integration tests.

Test Coverage

Aim for:

- Core business logic: 80%+ coverage
- Configuration: 70%+ coverage
- Integration points: Test happy path and error cases

Documentation

Code Documentation

Add scaladoc for public APIs:

```scala
/**
 * Repository for accessing data from the database.
 *
 * @tparam F the effect type
 */
trait DataRepository[F[_]] {
  /**
   * Stream data from the database.
   *
   * @param query the SQL query to execute
   * @param batchSize the number of records to fetch at once
   * @return a stream of SampleData
   */
  def streamData(query: String, batchSize: Int): Stream[F, SampleData]
}
```

README Updates

Update README.md if you:

- Add new features
- Change configuration
- Modify deployment process
- Add dependencies

Architecture Decisions

For significant changes, document the decision:

- Create the docs/adr/ directory if it doesn't exist
- Add NNN-decision-title.md with:
  - Context
  - Decision
  - Consequences

Pull Request Process

- Update Documentation: Ensure README and other docs are current
- Add Tests: All new code must have tests
- Pass CI: All tests and checks must pass
- Update Changelog: Add entry to CHANGELOG.md
- Request Review: Tag maintainers for review

PR Description Template

```
## Description
Brief description of changes

## Motivation
Why is this change needed?

## Changes
- List of changes

## Testing
How was this tested?

## Checklist
- [ ] Tests added/updated
- [ ] Documentation updated
- [ ] Changelog updated
- [ ] CI passing
```

Release Process

Maintainers will:

- Update version in build.sbt
- Update CHANGELOG.md
- Create git tag
- Publish release

Getting Help

- Issues: Open an issue on GitHub
- Discussions: Use GitHub Discussions
- Questions: Tag your issue with the question label

License

By contributing, you agree that your contributions will be licensed under the project's license (see LICENSE file).

Code of Conduct

- Be respectful and inclusive
- Welcome newcomers
- Focus on constructive feedback
- Follow the Scala Code of Conduct

Thank You!

Your contributions make this project better for everyone. Thank you for taking the time to contribute!
