This project demonstrates how to load and process the Delaney Solubility dataset using pandas. The dataset contains chemical descriptors and their corresponding solubility values, which can be used to predict the solubility of a compound.
The Delaney Solubility Dataset is available from the Data Professor's GitHub repository. It contains the following columns:
- logS: The aqueous solubility of the compound on a logarithmic scale (log10 of the solubility in mol/L). This is the value to predict.
- Various chemical descriptors that represent different properties of the compounds.
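
As a minimal sketch of how this column layout is typically used, the snippet below separates the logS target from the descriptor columns with pandas. The helper name split_features_target, the descriptor column names, and every value in the toy frame are placeholders invented for illustration, not rows from the real dataset. With the actual file you would pass its path or URL to pd.read_csv instead of building the frame by hand.

```python
import pandas as pd


def split_features_target(df: pd.DataFrame, target: str = "logS"):
    """Return (X, y): all descriptor columns, and the target column."""
    y = df[target]
    X = df.drop(columns=[target])
    return X, y


# Toy stand-in for the real CSV. Only the "logS" column name comes from the
# dataset description; the other names and all values are made up.
toy = pd.DataFrame(
    {
        "descriptor_a": [1.0, 2.0, 3.0],
        "descriptor_b": [10.0, 20.0, 30.0],
        "logS": [-1.0, -2.0, -3.0],
    }
)

X, y = split_features_target(toy)
print(X.shape, y.shape)
```

Keeping the target name as a parameter means the same helper still works if the solubility column is named differently in another copy of the data.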
The following Python libraries are required to run the script:
- pandas: For data manipulation and analysis.
- scikit-learn (imported as sklearn): For data splitting and prediction.
- matplotlib: For plotting the prediction graph.
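
The following sketch shows one way the three libraries could fit together: pandas holds the data, scikit-learn splits it and fits a model, and matplotlib plots predicted against actual values. It is an illustration under assumptions, not the project's script. The choice of LinearRegression, the 80/20 split, the output file name, and the synthetic data (generated from a simple linear rule) are all assumptions made here.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is needed

import matplotlib.pyplot as plt
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic placeholder data, NOT real solubility values:
# the target is exactly 2 * descriptor_a - descriptor_b.
X = pd.DataFrame(
    {
        "descriptor_a": [float(i) for i in range(20)],
        "descriptor_b": [float(i % 5) for i in range(20)],
    }
)
y = 2.0 * X["descriptor_a"] - X["descriptor_b"]

# Hold out 20% of the rows for testing; the fixed seed makes the split repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit on the training rows, then predict the held-out rows.
model = LinearRegression().fit(X_train, y_train)
y_pred = model.predict(X_test)

# Scatter plot of actual against predicted values for the held-out rows.
fig, ax = plt.subplots()
ax.scatter(y_test, y_pred)
ax.set_xlabel("Actual logS")
ax.set_ylabel("Predicted logS")
fig.savefig("prediction_plot.png")  # hypothetical output file name
```

Because the synthetic target is an exact linear function of the two descriptors, the fitted model reproduces it almost perfectly. Real solubility data will show scatter around the diagonal.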
You can install the required dependencies using pip:
pip install pandas
pip install scikit-learn
pip install matplotlib
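
To confirm the installation worked, a short check like the one below can report which versions are present. It is a sketch using only the standard library (Python 3.8+); the helper name installed_version is hypothetical. Note that the lookup uses the PyPI distribution name scikit-learn, even though the package is imported as sklearn.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name: str):
    """Return the installed version string, or None if the distribution is missing."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Distribution names as published on PyPI, not import names.
for name in ("pandas", "scikit-learn", "matplotlib"):
    print(name, installed_version(name) or "not installed")
```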