[NEMO-351] Empowering Nemo with fast I/O using Apache Crail#219
[NEMO-351] Empowering Nemo with fast I/O using Apache Crail#219hy00nc wants to merge 279 commits intoapache:masterfrom
Conversation
| case LOCAL_FILE: | ||
| return DataStoreProperty.Value.LocalFileStore; | ||
| case REMOTE_FILE: | ||
| return DataStoreProperty.Value.GlusterFileStore; |
There was a problem hiding this comment.
Actually, I got stuck here without a solution: Converting the ControlMessage REMOTE_FILE to either CrailFileStore or GlusterFileStore. For now, I just substituted GlusterFileStore to CrailFileStore, and let GlusterFileStore be converted to LocalFileStore, which absolutely isn't the right way :( Any suggestions?
There was a problem hiding this comment.
I'd use two variables: GLUSTER_FILE and CRAIL_FILE
pom.xml
Outdated
| <version>${powermock.version}</version> | ||
| <scope>test</scope> | ||
| </dependency> | ||
| <dependency> |
There was a problem hiding this comment.
Can you remove L117-L160?
I'm not familiar with Crail, but perhaps crail-client is sufficient for Nemo to use Crail.
It'd be great to use minimum dependencies.
There was a problem hiding this comment.
I first tried using only crail-client, but ClassNotFoundException occurs. Sadly, I am not sure why it works this way, so I just put all modules Crail provides (in case the specific module is needed).
| cl.registerShortNameOfClass(JobConf.MaxTaskAttempt.class); | ||
| cl.registerShortNameOfClass(JobConf.FileDirectory.class); | ||
| cl.registerShortNameOfClass(JobConf.GlusterVolumeDirectory.class); | ||
| cl.registerShortNameOfClass(JobConf.CrailVolumeDirectory.class); |
There was a problem hiding this comment.
Can you file a new JIRA issue to convert JobConf parameters for directory paths (FileDirectory, GlusterVolumeDirectory, CrailVolumeDirectory) to DataStoreProperty parameters? To do this, the type of the possible values of the DataStoreProperty should also change from enum to class.
I suggest this change, because I think we should aim to integrate into the IR DAG as many configuration parameters as possible. Another benefit is that it saves the trouble of having to configure both JobConf and IR DAG to use a specific Crail/Gluster/File directory. Finally, this enables us to configure different directories for different IREdges.
A related issue is integrating the executor_json parameter into the IR DAG: https://issues.apache.org/jira/browse/NEMO-382
| * Interface for remote block stores (e.g., GlusterFS, CrailFS...). | ||
| */ | ||
|
|
||
| @DefaultImplementation(GlusterFileStore.class) |
There was a problem hiding this comment.
Remove. I don't see a particular reason to prefer Gluster over Crail.
There was a problem hiding this comment.
Actually RemoteFileStore is constructed by default in every job, and using Crail as the default would cause every job to fail when there is no CrailFileSystem detected on the nodes that run the job. So maybe GlusterFileStore can be the default?
runtime/executor/src/main/java/org/apache/nemo/runtime/executor/data/BlockManagerWorker.java
Outdated
Show resolved
Hide resolved
| return Optional.of(block); | ||
| } catch (final IOException e) { | ||
| throw new BlockFetchException(e); | ||
| } catch (Exception e) { |
There was a problem hiding this comment.
Please use a more specific Exception class.
runtime/executor/src/main/java/org/apache/nemo/runtime/executor/data/stores/CrailFileStore.java
Outdated
Show resolved
Hide resolved
| if (fs.lookup(filePath).get() == null) { | ||
| return Optional.empty(); | ||
| } else { | ||
| try { |
There was a problem hiding this comment.
Why use a try here, when we're already in the try clause starting at L110?
| } | ||
| } catch (final IOException e) { | ||
| throw new BlockFetchException(e); | ||
| } catch (final Exception e) { |
runtime/executor/src/main/java/org/apache/nemo/runtime/executor/data/stores/CrailFileStore.java
Outdated
Show resolved
Hide resolved
|
Any update on this PR? |
|
@wonook Let's close stale PRs. Do you plan to merge this to |
JIRA: NEMO-351: Empowering Nemo with fast I/O using Apache Crail
Major changes:
crailif they intend to use CrailFileStore during their job execution. (Crail configuration and deployment has to take place.)