2 changes: 1 addition & 1 deletion CMakeLists.txt
@@ -92,7 +92,7 @@ target_include_directories(MPIToLLVMConversion


# Add executable and link it with the parser library
add_llvm_executable(dhir-opt main.cc libs/avialDialect.cc libs/avialOps.cc libs/utils.cc DEPENDS MLIRAvialDialectIncGen)
add_llvm_executable(dhir-opt main.cc libs/dhirDialect.cc libs/dhirOps.cc libs/utils.cc DEPENDS MLIRDhirDialectIncGen)
target_link_libraries(dhir-opt

PRIVATE
10 changes: 5 additions & 5 deletions README.md
@@ -1,5 +1,5 @@
# 🥗 Avial : An MLIR Dialect for Distributed Heterogeneous Computing
Avial is a compiler infrastructure built using MLIR that enables efficient execution of programs across distributed and heterogeneous computing systems (CPU, GPU, cluster). Avial introduces a novel task-centric intermediate representation (IR) where tasks are first-class citizens, capturing their parallelism, device targets, and interdependencies.
# 🥗 DHIR : An MLIR Dialect for Distributed Heterogeneous Computing
DHIR is a compiler infrastructure built using MLIR that enables efficient execution of programs across distributed and heterogeneous computing systems (CPU, GPU, cluster). DHIR introduces a novel task-centric intermediate representation (IR) where tasks are first-class citizens, capturing their parallelism, device targets, and interdependencies.
## 🚧 Project Status
![Current Focus](https://img.shields.io/badge/Current_Focus-Topology_Aware_Optimizations-blue) <br>
![Next Release](https://img.shields.io/badge/Next_Release-Multicore_+_GPU_+_Topology_Aware_Scheduling-white) <br>
@@ -15,7 +15,7 @@ Parallel programming is notoriously difficult. Developers must reason about conc

Unifying these paradigms into a single coherent programming or compilation model is non-trivial due to fundamental differences in their memory models, synchronization semantics, and communication mechanisms. While there have been commendable efforts at unifying heterogeneous computing within a node, such as OpenCL, OpenACC, and more recently Mojo, there is a noticeable gap when it comes to extending these unifications across distributed environments. The gap remains largely due to the complexity of distributed computing: issues such as explicit data movement between nodes and network topology cannot be abstracted away as easily.

## Why Avial Is Unique?
## Why DHIR Is Unique?

While MLIR includes dialects like omp for shared-memory parallelism and gpu for targeting accelerators such as CUDA or ROCm, there is currently no dialect that provides a unified abstraction for distributed heterogeneous computing, that is, for clusters of nodes with diverse compute units like CPUs and GPUs.

@@ -38,6 +38,6 @@ Here's how it works in practice:
- Lowers the task to the different backend (e.g., LLVM, CUDA, ROCm)
- Handles device setup and data movement

Thanks to the `CodeDrop` approach, integrating the Avial dialect into existing compiler pipelines is both trivial and non-intrusive. The process begins by identifying performance-critical regions such as loops, compute kernels, or math-heavy operations regardless of which dialect they're written in. These regions are then wrapped in a TaskOp. That’s it. From there, Avial takes full control, automatically lowering tasks to the appropriate execution backends including MPI for distributed execution and ultimately to LLVM IR.
Thanks to the `CodeDrop` approach, integrating the DHIR dialect into existing compiler pipelines is both trivial and non-intrusive. The process begins by identifying performance-critical regions such as loops, compute kernels, or math-heavy operations regardless of which dialect they're written in. These regions are then wrapped in a TaskOp. That’s it. From there, DHIR takes full control, automatically lowering tasks to the appropriate execution backends including MPI for distributed execution and ultimately to LLVM IR.

This approach not only simplifies integration but also scales easily across heterogeneous and distributed environments. Whether running on a single multicore CPU or across a CPU-GPU cluster with MPI, Avial ensures consistent handling of task distribution and coordination.
This approach not only simplifies integration but also scales easily across heterogeneous and distributed environments. Whether running on a single multicore CPU or across a CPU-GPU cluster with MPI, DHIR ensures consistent handling of task distribution and coordination.
4 changes: 2 additions & 2 deletions analysis/arrayPartitionAnalysis.h
@@ -11,7 +11,7 @@

namespace mlir
{
namespace avial
namespace dhir
{
struct ArrayPartitioningInfo
{
@@ -752,7 +752,7 @@ namespace mlir
return analysis.analyzeArray(memref);
}

} // namespace avial
} // namespace dhir
} // namespace mlir

#endif
18 changes: 9 additions & 9 deletions analysis/broadcastAnalysis.h
@@ -12,13 +12,13 @@
#include <algorithm>
#include "arrayPartitionAnalysis.h"

#include "includes/avialOps.h"
#include "includes/avialDialect.h"
#include "includes/dhirOps.h"
#include "includes/dhirDialect.h"


namespace mlir
{
namespace avial
namespace dhir
{

// ========================================================================
@@ -48,7 +48,7 @@ namespace mlir

// Get all write arguments from this replicate operation

llvm::SmallVector<Value> writeArgs = mlir::dyn_cast<mlir::avial::ReplicateOp>(replicateOp).getWrites();
llvm::SmallVector<Value> writeArgs = mlir::dyn_cast<mlir::dhir::ReplicateOp>(replicateOp).getWrites();

if (writeArgs.empty()) {
llvm::errs() << "No write arguments found in replicate operation\n";
@@ -142,7 +142,7 @@ namespace mlir

// Walk through all operations in the parent

parentOp->walk([&](mlir::avial::ReplicateOp op) {
parentOp->walk([&](mlir::dhir::ReplicateOp op) {
// Skip until we find the current replicate operation
if (op == replicateOp) {
foundCurrentReplicate = true;
@@ -165,7 +165,7 @@ namespace mlir
// Check if a memref is read by a replicate operation
bool isReadByReplicate(mlir::Operation *replicate, Value memref)
{
llvm::SmallVector<Value> readArgs = mlir::dyn_cast<mlir::avial::ReplicateOp>(replicate).getReads();
llvm::SmallVector<Value> readArgs = mlir::dyn_cast<mlir::dhir::ReplicateOp>(replicate).getReads();
llvm::errs() << "Memref: ";
memref.dump();
for (Value readArg : readArgs) {
@@ -222,7 +222,7 @@ namespace mlir
return false; // Not found, default to no broadcast
}

} // namespace avial
} // namespace dhir
} // namespace mlir

/*
@@ -233,10 +233,10 @@ namespace mlir
* #include "BroadcastAnalysis.h"
*
* void processReplicateOp(mlir::Operation *replicateOp) {
* std::vector<mlir::avial::BroadcastInfo> broadcastDecisions;
* std::vector<mlir::dhir::BroadcastInfo> broadcastDecisions;
*
* // Analyze the replicate operation
* if (!mlir::avial::analyzeBroadcastRequirements(replicateOp, broadcastDecisions)) {
* if (!mlir::dhir::analyzeBroadcastRequirements(replicateOp, broadcastDecisions)) {
* llvm::errs() << "Broadcast analysis failed!\n";
* return;
* }
8 changes: 4 additions & 4 deletions analysis/depGraph.h
@@ -1,9 +1,9 @@
#include "mlir/Conversion/Passes.h"
#include "mlir/Pass/PassManager.h"

#include "includes/avialDialect.h"
#include "includes/dhirDialect.h"
#include "mlir/Dialect/MemRef/IR/MemRef.h"
#include "includes/avialOps.h"
#include "includes/dhirOps.h"
#include "mlir/Dialect/SCF/IR/SCF.h"

enum class TargetType
@@ -146,7 +146,7 @@ bool memoryAccessesConflict(mlir::Value val1, mlir::Value val2)

namespace mlir
{
namespace avial
namespace dhir
{

struct DependencyGraph
@@ -158,7 +158,7 @@ namespace mlir
bool hasLoop; // Whether tasks are inside a loop
mlir::scf::ForOp forLoop; // The loop containing tasks (if any)

void build(avial::ScheduleOp schedule)
void build(dhir::ScheduleOp schedule)
{
llvm::errs() << "-- Building task dependency graph\n";

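For readers unfamiliar with the idea, here is a minimal sketch of how a task dependency graph can be derived from read and write sets. It is an illustration only, not the code in `depGraph.h`, whose `build` body is elided in this diff. The names `Task`, `overlaps`, `tasksConflict`, and `buildDependencyEdges` are hypothetical, and plain string buffer names stand in for `mlir::Value`.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a task: the buffers it reads and writes, by name.
struct Task {
    std::string name;
    std::set<std::string> reads;
    std::set<std::string> writes;
};

// True if the two sets share at least one buffer name.
static bool overlaps(const std::set<std::string>& a,
                     const std::set<std::string>& b) {
    for (const auto& x : a)
        if (b.count(x)) return true;
    return false;
}

// Two tasks conflict when at least one of them writes a buffer the other touches.
static bool tasksConflict(const Task& earlier, const Task& later) {
    return overlaps(earlier.writes, later.reads)    // read-after-write
        || overlaps(earlier.reads, later.writes)    // write-after-read
        || overlaps(earlier.writes, later.writes);  // write-after-write
}

// edges[i] lists the indices of later tasks that must wait for task i.
// Tasks are assumed to be given in program order.
std::vector<std::vector<std::size_t>>
buildDependencyEdges(const std::vector<Task>& tasks) {
    std::vector<std::vector<std::size_t>> edges(tasks.size());
    for (std::size_t i = 0; i < tasks.size(); ++i)
        for (std::size_t j = i + 1; j < tasks.size(); ++j)
            if (tasksConflict(tasks[i], tasks[j]))
                edges[i].push_back(j);
    return edges;
}
```

The pairwise comparison is quadratic in the number of tasks, which is usually acceptable for the handful of tasks in one schedule region. Tasks with no incoming edge can run concurrently.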
6 changes: 3 additions & 3 deletions analysis/insoutAnalysis.h
@@ -9,8 +9,8 @@
#include "mlir/IR/MLIRContext.h"
#include "mlir/IR/BuiltinOps.h"

#include "includes/avialOps.h"
#include "includes/avialDialect.h"
#include "includes/dhirOps.h"
#include "includes/dhirDialect.h"

struct InsOutsAnalysis
{
@@ -19,7 +19,7 @@ struct InsOutsAnalysis

InsOutsAnalysis(mlir::Operation *op)
{
if (auto schOp = mlir::dyn_cast<mlir::avial::ScheduleOp>(op))
if (auto schOp = mlir::dyn_cast<mlir::dhir::ScheduleOp>(op))
{
if (mlir::Block *block = &schOp.getRegion().front())
{
2 changes: 1 addition & 1 deletion analysis/polyhedralAnalysis.h
@@ -13,7 +13,7 @@

namespace mlir
{
namespace avial
namespace dhir
{
int checkLoopDependence(mlir::affine::AffineForOp op, int depth)
{
34 changes: 17 additions & 17 deletions conversions/affinetoavial.h → conversions/affinetodhir.h
@@ -1,8 +1,8 @@
#include "mlir/Pass/PassManager.h"
#include "mlir/IR/PatternMatch.h"

#include "includes/avialDialect.h"
#include "includes/avialTypes.h"
#include "includes/dhirDialect.h"
#include "includes/dhirTypes.h"

#include "mlir/Transforms/DialectConversion.h"

@@ -14,9 +14,9 @@
#include "mlir/Dialect/Func/IR/FuncOps.h"
#include "mlir/Dialect/SCF/IR/SCF.h"

#include "includes/avialDialect.h"
#include "includes/avialOps.h"
#include "includes/avialTypes.h"
#include "includes/dhirDialect.h"
#include "includes/dhirOps.h"
#include "includes/dhirTypes.h"
#include "includes/utils.h"

#include "mlir/Dialect/DLTI/DLTI.h"
@@ -29,20 +29,20 @@
#include <string>

using namespace mlir;
using namespace avial;
using namespace dhir;

namespace mlir
{
namespace avial
namespace dhir
{

#define GEN_PASS_DEF_CONVERTAFFINETOAVIALPASS
#define GEN_PASS_DEF_CONVERTAFFINETODHIRPASS

#include "dialect/Passes.h.inc"

struct ConvertAffineToAvialPass : public mlir::avial::impl::ConvertAffineToAvialPassBase<ConvertAffineToAvialPass>
struct ConvertAffineToDhirPass : public mlir::dhir::impl::ConvertAffineToDhirPassBase<ConvertAffineToDhirPass>
{
using ConvertAffineToAvialPassBase::ConvertAffineToAvialPassBase;
using ConvertAffineToDhirPassBase::ConvertAffineToDhirPassBase;

// Helper function to check if a loop is independent (considering only its own iterations)
// This checks the loop in isolation, not in the context of parent loops
@@ -146,7 +146,7 @@ namespace mlir

bool isStencil = false;

mlir::avial::ArrayPartitioningAnalysis analysis(forOp);
mlir::dhir::ArrayPartitioningAnalysis analysis(forOp);
for (Value in : insouts[0])
{
auto info = analysis.analyzeArray(in);
@@ -162,7 +162,7 @@ namespace mlir
}

builder.setInsertionPoint(forOp);
auto replicateOp = builder.create<mlir::avial::ReplicateOp>(forOp.getLoc(), insouts[0], insouts[1]);
auto replicateOp = builder.create<mlir::dhir::ReplicateOp>(forOp.getLoc(), insouts[0], insouts[1]);
replicateOp->setAttr("replicateID", builder.getI64IntegerAttr(repId));

if (isStencil)
@@ -175,7 +175,7 @@ namespace mlir

forOp->moveBefore(newBlock, newBlock->end());
builder.setInsertionPointToEnd(newBlock);
builder.create<mlir::avial::YieldOp>(builder.getUnknownLoc());
builder.create<mlir::dhir::YieldOp>(builder.getUnknownLoc());

llvm::errs() << "Wrapped loop with ReplicateOp (replicateID=" << repId << ")\n";
++repId;
@@ -275,15 +275,15 @@ namespace mlir
auto insouts = InsOutsAnalysis::getInsandOut(forOp);

builder.setInsertionPoint(forOp);
auto replicateOp = builder.create<mlir::avial::ReplicateOp>(forOp.getLoc(), insouts[0], insouts[1]);
auto replicateOp = builder.create<mlir::dhir::ReplicateOp>(forOp.getLoc(), insouts[0], insouts[1]);
replicateOp->setAttr("replicateID", builder.getI64IntegerAttr(repId));

mlir::Region &replicateRegion = replicateOp.getBodyRegion();
mlir::Block *newBlock = builder.createBlock(&replicateRegion);

forOp->moveBefore(newBlock, newBlock->end());
builder.setInsertionPointToEnd(newBlock);
builder.create<mlir::avial::YieldOp>(builder.getUnknownLoc());
builder.create<mlir::dhir::YieldOp>(builder.getUnknownLoc());
++repId;
}

@@ -301,15 +301,15 @@ namespace mlir
auto insouts = InsOutsAnalysis::getInsandOut(forOp);

builder.setInsertionPoint(forOp);
auto convergeOp = builder.create<mlir::avial::ConvergeOp>(forOp.getLoc(), insouts[0], insouts[1]);
auto convergeOp = builder.create<mlir::dhir::ConvergeOp>(forOp.getLoc(), insouts[0], insouts[1]);
convergeOp->setAttr("ConvergeID", builder.getI64IntegerAttr(taskId));

mlir::Region &ConvergeRegion = convergeOp.getBodyRegion();
mlir::Block *newBlock = builder.createBlock(&ConvergeRegion);

forOp->moveBefore(newBlock, newBlock->end());
builder.setInsertionPointToEnd(newBlock);
builder.create<mlir::avial::YieldOp>(builder.getUnknownLoc());
builder.create<mlir::dhir::YieldOp>(builder.getUnknownLoc());

++taskId;
}