Thank you for considering contributing to async2databricks! This document provides guidelines for contributing to the project.

Prerequisites:
- Java 11 or later
- SBT 1.9.7
- Docker and Docker Compose
- Git

To get started:
- Fork the repository
- Clone your fork:
    git clone https://github.com/YOUR_USERNAME/async2databricks.git
    cd async2databricks
- Set up the development environment:
    docker compose up -d
    sbt compile
- Create a feature branch:
    git checkout -b feature/your-feature-name

Follow the project structure:
src/
├── main/
│   ├── scala/com/async2databricks/
│   │   ├── config/      # Configuration models
│   │   ├── database/    # Database access layer
│   │   ├── etl/         # ETL pipeline logic
│   │   ├── model/       # Domain models
│   │   └── s3/          # S3 writer
│   └── resources/
│       └── application.conf   # Configuration
└── test/
    └── scala/com/async2databricks/   # Unit tests

All new functionality should include tests:
package com.async2databricks.yourpackage

import org.scalatest.flatspec.AnyFlatSpec
import org.scalatest.matchers.should.Matchers

class YourSpec extends AnyFlatSpec with Matchers {
  "YourClass" should "do something" in {
    // test implementation
  }
}

Run tests:
sbt test

The project uses standard Scala conventions:
- Use 2 spaces for indentation
- Line length: 120 characters
- Use meaningful variable names
- Add scaladoc comments for public APIs
Format your code:
sbt scalafmt

Write clear commit messages:
git add .
git commit -m "feat: add support for incremental loads

- Add watermark tracking
- Implement checkpoint mechanism
- Update tests"

Follow conventional commits:
- feat: for new features
- fix: for bug fixes
- docs: for documentation
- test: for test changes
- refactor: for refactoring

Push your branch:
git push origin feature/your-feature-name

Then create a Pull Request on GitHub.

This project uses functional programming with Cats Effect:
// Good: Pure functional code
def loadData[F[_]: Async](config: Config): F[List[Data]] = {
  for {
    conn <- createConnection(config)
    data <- fetchData(conn)
  } yield data
}

// Avoid: Imperative code with side effects
def loadData(config: Config): List[Data] = {
  val conn = createConnection(config) // side effect
  fetchData(conn)
}

Use Either, Option, or effect types for error handling:
// Good
def parse(input: String): Either[ParseError, Result] = ???

// Avoid
def parse(input: String): Result = {
  if (invalid) throw new Exception("Invalid")
  else result
}

Leverage Scala's type system:
// Good: Type-safe configuration
case class DatabaseConfig(
  url: String,
  user: String,
  password: String,
  poolSize: Int
)

// Avoid: Stringly-typed configuration
def getConfig(key: String): String = ???

Always use Resource for managing resources:
// Good
def createConnection[F[_]: Async]: Resource[F, Connection] = {
  Resource.make(acquire)(release)
}

// Avoid
def createConnection[F[_]: Async]: F[Connection] = {
  acquire // no cleanup
}

Test individual components in isolation:
"DataRepository" should "stream data correctly" in {
  val repo = DataRepository(transactor)
  val result = repo.streamData("SELECT * FROM test", 100)
    .compile
    .toList
    .unsafeRunSync() // assuming F is IO; needs cats.effect.unsafe.implicits.global in scope
  result should have size 10
}

Test interactions between components. Use Docker for integration tests.
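
One way to keep Docker-backed integration tests apart from fast unit tests is ScalaTest tagging. The following is a minimal sketch under assumptions, not the project's existing setup: the tag name `integration`, the `DockerIntegration` object, and `PipelineIntegrationSpec` are hypothetical names chosen for illustration.

```scala
import org.scalatest.Tag
import org.scalatest.flatspec.AnyFlatSpec
import org.scalatest.matchers.should.Matchers

// Hypothetical tag for tests that need the Docker Compose services running.
object DockerIntegration extends Tag("integration")

class PipelineIntegrationSpec extends AnyFlatSpec with Matchers {

  // Tagged test: only meaningful after `docker compose up -d`.
  "The ETL pipeline" should "move rows from the database to S3" taggedAs(DockerIntegration) in {
    // Placeholder body: wire the real components against the Docker services here.
    pending
  }
}
```

With a tag like this, `sbt "testOnly -- -l integration"` would skip the tagged tests, and `sbt "testOnly -- -n integration"` would run only them.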
Aim for:
- Core business logic: 80%+ coverage
- Configuration: 70%+ coverage
- Integration points: Test happy path and error cases
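
If you want to check these targets locally, a coverage plugin can enforce a minimum. The fragment below is a sketch that assumes sbt-scoverage, which this guide does not confirm the project uses; the plugin version shown is an example, so check the plugin's releases before copying it. It sets a single project-wide floor rather than the per-area targets listed above.

```scala
// project/plugins.sbt
// Assumption: sbt-scoverage is not necessarily part of this project's build.
addSbtPlugin("org.scoverage" % "sbt-scoverage" % "2.0.9")

// build.sbt
// Fail the coverage report step when statement coverage drops below 70%.
coverageMinimumStmtTotal := 70
coverageFailOnMinimum := true
```

With the plugin enabled, `sbt coverage test coverageReport` runs the tests with instrumentation and writes the report.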
Add scaladoc for public APIs:
/**
 * Repository for accessing data from the database.
 *
 * @tparam F the effect type
 */
trait DataRepository[F[_]] {
  /**
   * Stream data from the database.
   *
   * @param query the SQL query to execute
   * @param batchSize the number of records to fetch at once
   * @return a stream of SampleData
   */
  def streamData(query: String, batchSize: Int): Stream[F, SampleData]
}

Update README.md if you:
- Add new features
- Change configuration
- Modify deployment process
- Add dependencies
For significant changes, document the decision:
- Create the docs/adr/ directory if it doesn't exist
- Add NNN-decision-title.md with:
  - Context
  - Decision
  - Consequences

Pull request process:
- Update Documentation: Ensure README and other docs are current
- Add Tests: All new code must have tests
- Pass CI: All tests and checks must pass
- Update Changelog: Add entry to CHANGELOG.md
- Request Review: Tag maintainers for review

Use this template for the pull request description:
## Description
Brief description of changes
## Motivation
Why is this change needed?
## Changes
- List of changes
## Testing
How was this tested?
## Checklist
- [ ] Tests added/updated
- [ ] Documentation updated
- [ ] Changelog updated
- [ ] CI passing

Maintainers will:
- Update version in build.sbt
- Update CHANGELOG.md
- Create git tag
- Publish release

To get help:
- Issues: Open an issue on GitHub
- Discussions: Use GitHub Discussions
- Questions: Tag your issue with the "question" label

By contributing, you agree that your contributions will be licensed under the project's license (see LICENSE file).

Code of conduct:
- Be respectful and inclusive
- Welcome newcomers
- Focus on constructive feedback
- Follow the Scala Code of Conduct
Your contributions make this project better for everyone. Thank you for taking the time to contribute!