Question about how mamba chat training is done #35

@aravindkoti

Hi,

First off, thanks for providing training code for mamba use cases.

I was looking at how the training for mamba-chat is done, and one thing I'm unclear on is the "preprocess" function used in the class "ChatDataset" (in "/trainer/data.py"). Why does it return a dictionary in which the labels are just the input ids again, rather than a separate set of labels?

dict(input_ids = all_input_ids, labels=all_input_ids)

I'm a little confused: wouldn't we need the data of both the user and the assistant to train a chatbot? I noticed this same pattern in a few other places in the training code, so I wanted to ask.
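To make the question concrete, here is a minimal sketch of how I imagine such a dictionary would be consumed. It assumes the training loop applies the usual causal-LM shift, where the prediction at each position is scored against the token one step ahead. I have not confirmed that the repo's trainer does this. The helper `next_token_pairs` and the toy token ids are hypothetical and not taken from the repository.

```python
# Hypothetical illustration; none of these names come from the repository.
def next_token_pairs(input_ids, labels):
    """Pair each input token with the label one position ahead.

    This mirrors the standard causal-LM shift: positions 0..n-2 of the
    input are scored against positions 1..n-1 of the labels.
    """
    return list(zip(input_ids[:-1], labels[1:]))


# Toy token ids standing in for one tokenized user + assistant conversation.
all_input_ids = [101, 7, 8, 9, 102]

# Same shape as the dictionary returned by "preprocess".
batch = dict(input_ids=all_input_ids, labels=all_input_ids)

pairs = next_token_pairs(batch["input_ids"], batch["labels"])
print(pairs)  # (input token, target token) pairs
```

If the shift does happen this way, identical input ids and labels would still give each position a different target, which may be why no separate labels are built. Is that what is going on here?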

Thanks!
