A Python utility to infer and report data types for columns in tabular files like CSVs and TSVs.
- Parses CSV and TSV files automatically.
- Infers common data types (integer, float, string, boolean, date) for each column.
- Identifies columns with mixed data types, reporting the types found.
- Reports unique value counts for categorical fields.
- Generates a summarized schema report to console, JSON, or CSV output.
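The type inference, mixed-type detection, and unique counts listed above can be illustrated with a small standard-library sketch. This is not Field Observer's actual implementation: the function names `infer_value_type` and `summarize_column`, the accepted boolean spellings, and the single `YYYY-MM-DD` date format are all assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime

# Assumed boolean spellings; the real tool may accept a different set.
BOOL_LITERALS = {"true", "false", "yes", "no"}


def infer_value_type(value: str) -> str:
    """Classify one cell as None, Boolean, Integer, Float, Date, or String."""
    text = value.strip()
    if text == "":
        return "None"
    if text.lower() in BOOL_LITERALS:
        return "Boolean"
    try:
        int(text)
        return "Integer"
    except ValueError:
        pass
    try:
        float(text)  # note: float() also accepts "nan" and "inf"
        return "Float"
    except ValueError:
        pass
    try:
        datetime.strptime(text, "%Y-%m-%d")  # assumed date format
        return "Date"
    except ValueError:
        return "String"


def summarize_column(values):
    """Return the dominant type, any mixed types, and the unique value count."""
    counts = Counter(infer_value_type(v) for v in values)
    dominant = counts.most_common(1)[0][0] if counts else "None"
    return {
        "inferred_type": dominant,
        "mixed_types": sorted(counts) if len(counts) > 1 else [],
        "unique_count": len(set(values)),
    }


if __name__ == "__main__":
    print(summarize_column(["hello", "", "world", "hello"]))
    # {'inferred_type': 'String', 'mixed_types': ['None', 'String'], 'unique_count': 3}
```

In this sketch the most frequent type wins and the remaining types are only reported, which is one plausible policy among several.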
- Clone the repository:
    git clone https://github.com/your-username/field-observer.git
    cd field-observer
- Install dependencies:
    pip install -r requirements.txt
- Install as an editable package (optional, for development):
    pip install -e .
Field Observer provides a command-line interface (CLI) to analyze your data files.
To analyze a CSV or TSV file and print the report to the console:

    field-observer analyze <path/to/your/file.csv>

Example:

    field-observer analyze data/sample.csv

By default, the tool attempts to infer the delimiter (comma for CSV, tab for TSV). If you need to explicitly specify it, use the -d or --delimiter option:

    field-observer analyze data/custom_delimiter.txt --delimiter ';'

You can specify the output format using the -o or --output option. Supported formats are console (default), json, and csv.
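On the delimiter inference described above: how Field Observer detects the delimiter is not documented here, but one way to approximate it is Python's `csv.Sniffer`. The sketch below is an assumption-laden illustration; the helper name `guess_delimiter`, the candidate set, and the comma fallback are hypothetical.

```python
import csv


def guess_delimiter(sample: str, candidates: str = ",\t;|", default: str = ",") -> str:
    """Guess the field delimiter from a text sample, falling back to `default`."""
    try:
        # Sniffer.sniff raises csv.Error when no candidate fits the sample.
        return csv.Sniffer().sniff(sample, delimiters=candidates).delimiter
    except csv.Error:
        return default


if __name__ == "__main__":
    print(repr(guess_delimiter("id;name;age\n1;Ann;30\n2;Bob;25\n")))
    # ';'
```

`csv.Sniffer` is heuristic, so an explicit option such as --delimiter remains the reliable route for unusual files.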
To save the schema report as a JSON file:

    field-observer analyze data/sample.csv --output json > schema_report.json

Or write directly to a file:

    field-observer analyze data/sample.csv --output json --file schema_report.json

To save the schema report as a CSV file:

    field-observer analyze data/sample.csv --output csv > schema_report.csv

Or write directly to a file:

    field-observer analyze data/sample.csv --output csv --file schema_report.csv

Example console output:

Analyzing file: data/sample.csv
--------------------------------------------------------------------------------
Column Name       | Inferred Type     | Mixed Types  | Unique Count
--------------------------------------------------------------------------------
id                | Integer           |              | 100
name              | String            |              | 98
email             | String            |              | 100
age               | Integer           |              | 50
is_active         | Boolean           |              | 2
registration_date | Date (YYYY-MM-DD) |              | 80
price             | Float             |              | 60
category          | String            |              | 5
notes             | String            | None, String | 70
--------------------------------------------------------------------------------
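For the json and csv output formats, the layout of the generated files is not shown in this README. The sketch below is only a guess at how rows like those in the table could be serialized with the standard library; the key names in `FIELDS` and the helpers `to_json` and `to_csv` are hypothetical, and the two sample rows are taken from the table above.

```python
import csv
import io
import json

# Hypothetical key names; the tool's real report schema may differ.
FIELDS = ["column", "inferred_type", "mixed_types", "unique_count"]

rows = [
    {"column": "id", "inferred_type": "Integer", "mixed_types": [], "unique_count": 100},
    {"column": "notes", "inferred_type": "String", "mixed_types": ["None", "String"], "unique_count": 70},
]


def to_json(report_rows):
    """Serialize the report rows as an indented JSON array."""
    return json.dumps(report_rows, indent=2)


def to_csv(report_rows):
    """Serialize the report rows as CSV, joining mixed types into one cell."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for row in report_rows:
        writer.writerow(dict(row, mixed_types=", ".join(row["mixed_types"])))
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_csv(rows))
```

Joining the mixed types into a single cell keeps the CSV at one row per column; the csv module quotes that cell automatically when it contains a comma.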
Contributions are welcome! Please feel free to open issues or submit pull requests.
- Fork the repository.
- Create a new branch (git checkout -b feature/your-feature-name).
- Make your changes.
- Commit your changes (git commit -am 'feat: Add some feature').
- Push to the branch (git push origin feature/your-feature-name).
- Open a Pull Request.