DataProfiler.jl provides a one-command, human-friendly DataFrame profiler in Julia.
Writing the same handful of lines every time you pick up a new dataset is tedious: computing summaries, checking missingness, inspecting distributions, and making a few quick plots, all just to get oriented. DataProfiler.jl automates these repetitive early steps of data exploration so you can focus on the real analysis.
Tools like ydata-profiling make this easier in Python, but Julia has lacked a native alternative. DataProfiler.jl fills that gap with an interface built directly on core Julia, DataFrames, and StatsBase. Because it is written in pure Julia, it can take full advantage of the performance and composability of the Julia ecosystem.
📄 Full documentation is available here.
The package can be installed with the Julia package manager. From the Julia REPL, type ] to enter the Pkg REPL mode and run:
pkg> add https://github.com/aryahassibi/DataProfiler.jl
Or, equivalently, via the Pkg API:
julia> import Pkg; Pkg.add(url="https://github.com/aryahassibi/DataProfiler.jl")
DataProfiler.jl supports Julia 1.10 and later.
- Make sure the project is activated (see Installation).
- Start Julia and run the snippet below:
using DataProfiler, DataFrames, Random
Random.seed!(1)
df = DataFrame(
a = randn(200),
b = rand(1:5, 200),
c = rand(["2025-09-01", "x", "y"], 200),
)
report = profile(df; sample_rows = 150, maxplots = 4)
save_report(report, "report.md")
- Inspect the generated report.md (and the ASCII plots embedded inside). Any PNG charts, if CairoMakie is available, are saved to profile_artifacts/ beside the report.
CairoMakie is needed for PNG plots. You can either rely on the ASCII plots, or install CairoMakie.
(using Pkg; Pkg.add("CairoMakie"))
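If you want a script to check whether CairoMakie is installed before expecting PNG output, one option is the sketch below. It uses `Base.find_package`, which returns `nothing` when a package cannot be found in the active environment. The helper name `has_package` is hypothetical and is not part of DataProfiler.jl, and this is not necessarily how DataProfiler.jl itself detects CairoMakie.

```julia
# Hypothetical helper (not part of DataProfiler.jl): report whether a package
# can be located in the currently active environment.
has_package(name::AbstractString) = Base.find_package(String(name)) !== nothing

if has_package("CairoMakie")
    println("CairoMakie found: PNG plots should be possible.")
else
    println("CairoMakie not found: relying on ASCII plots only.")
end
```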
Run pkg> test inside the project to verify everything after making changes.
Warning
PNG boxplots need StatsMakie (which may not resolve on Julia 1.11)
- Data diagnostics: missingness overview, duplicate detection, semantic hints for ID/date/categorical columns
- Numeric analysis: summary statistics, quantiles, skewness/kurtosis, MAD/IQR outlier detection
- Visual summaries: ASCII histograms and boxplots via UnicodePlots; optional PNG plots via CairoMakie
- Non-destructive design: never mutates input DataFrames; sampling avoids expensive per-column work
- Headless-friendly: plotting and summaries work in scripts, terminals, and CI environments
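To make the MAD/IQR outlier detection mentioned above concrete, here is a minimal sketch of both rules using only the `Statistics` standard library. The function names, the fence multiplier `k = 1.5`, and the robust z-score threshold `3.5` are illustrative assumptions. They are common conventions, not necessarily what DataProfiler.jl uses internally.

```julia
using Statistics

# Illustrative sketch (not DataProfiler.jl's implementation).
# IQR rule: flag values outside Tukey's fences [Q1 - k*IQR, Q3 + k*IQR].
function iqr_outliers(x::AbstractVector{<:Real}; k::Real = 1.5)
    q1, q3 = quantile(x, [0.25, 0.75])
    iqr = q3 - q1
    return (x .< q1 - k * iqr) .| (x .> q3 + k * iqr)
end

# MAD rule: flag values whose robust z-score exceeds `threshold`.
# 0.6745 rescales the MAD so the score is comparable to a standard z-score
# when the data are normally distributed.
function mad_outliers(x::AbstractVector{<:Real}; threshold::Real = 3.5)
    med = median(x)
    mad = median(abs.(x .- med))
    mad == 0 && return falses(length(x))  # no spread: nothing can be flagged
    z = 0.6745 .* (x .- med) ./ mad
    return abs.(z) .> threshold
end

x = [1.0, 2.0, 2.5, 3.0, 3.5, 4.0, 100.0]
iqr_flags = iqr_outliers(x)   # only the last element (100.0) is flagged
mad_flags = mad_outliers(x)   # only the last element (100.0) is flagged
```

Both functions return a boolean mask and leave the input vector untouched, in keeping with the non-destructive design described above.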
Active development happens on the develop branch.
Releases are merged into main and tagged.
git clone https://github.com/aryahassibi/DataProfiler.jl.git
cd DataProfiler.jl
julia --project=.
julia> ] instantiate
julia> ] test
julia --project=docs docs/make.jl
Contributions are very welcome! ❤️
