C# example: negative offset in Starts #177

@dbalikhin

I'm trying to run the C# sample from the README file:

            // allocate space for ids and offsets
            int[] Ids = new int[128];
            int[] Starts = new int[128];
            int[] Ends = new int[128];

            // tokenize with loaded XLM Roberta tokenization and output ids and start and end offsets
            var outputCount = BlingFireUtils.TextToIdsWithOffsets(h, inBytes, inBytes.Length, Ids, Starts, Ends, Ids.Length, 0);

            Console.WriteLine(String.Format("return length: {0}", outputCount));
            if (outputCount >= 0)
            {
                Console.Write("tokens from offsets: [");
                for (int i = 0; i < outputCount; ++i)
                {
                    int startOffset = Starts[i];

Starts[0] is -1, which causes an exception once startOffset is used as an offset:

System.ArgumentOutOfRangeException: 'Non-negative number required. (Parameter 'offset')'

I can work around it with a dummy fix, after which the output looks correct:

if (Starts[0] == -1)
{
    Starts[0] = 0;
}
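
A slightly more general form of the same workaround, as a sketch only. `OffsetFixup` and `ClampNegativeStarts` are hypothetical names of my own, not part of BlingFire, and I am assuming that clamping a negative start to 0 is acceptable for display purposes. It does not explain why the offset is negative in the first place.

```csharp
using System;

static class OffsetFixup
{
    // Hypothetical helper, not a BlingFire API.
    // Replaces every negative value among the first `count` entries of
    // `starts` with 0 and returns how many entries were changed.
    public static int ClampNegativeStarts(int[] starts, int count)
    {
        int changed = 0;

        // Never read past the end of the buffer, even if count is too large.
        int limit = Math.Min(count, starts.Length);

        for (int i = 0; i < limit; ++i)
        {
            if (starts[i] < 0)
            {
                starts[i] = 0;
                ++changed;
            }
        }

        return changed;
    }
}
```

Calling `OffsetFixup.ClampNegativeStarts(Starts, outputCount)` right after `TextToIdsWithOffsets` and before the loop would cover the `Starts[0] == -1` case and any other negative entries. If `outputCount` is negative, the loop does not run and nothing is changed.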

I'm not sure why I am getting a negative offset.

System: Windows 11 x64 (getting the same error in WSL Ubuntu 22.04)
.NET 7.0 Console App

PS: I also had to add
Console.OutputEncoding = System.Text.Encoding.UTF8; to display UTF-8 properly in the console app, but that fix is easy to explain.
