Skip to content

Visual-Intelligence-UMN/Ink-Pulse

Repository files navigation

InkPulse

The repository is for InkPulse, a visual analytics system that supports the identification, search, and analysis of key interactions in human-AI co-writing.

The introduction video can be found here

Progress.mp4

Table of Contents

Getting Started

If running for the first time, ensure you have Node.js v18 or newer installed. You can check your version by running:

node -v

Install all dependencies by using:

npm install

Then, run the development server:

npm run dev

Open http://localhost:5173 with your browser to see the result.

Project Structure

.
├── .github/
│   └── workflows/        # github action files for app building and delopyment
│ 
├── src/
│   ├── components/       # Svelte UI components (charts, dialogs, panels)
│   ├── lib/              # Shared utilities/helpers
│   ├── routes/           # SvelteKit routes (pages + endpoints)
│   ├── workers/          # Web workers for background tasks
│   ├── app.d.ts          # App-level TypeScript declarations
│   └── app.html          # HTML template
│ 
├── static/
│   ├── backend/          # Python backend for data processing
│   ├── dataset/          # Processed datasets used by the app
│   └── patterns/         # User-saved patterns from visual exploration
│ 
├── package.json
└── README.md

Data Structure

InkPulse organizes writing session data across three hierarchical levels of abstraction: EventsEvent BlocksSession Info.

  • Events (Individual User Actions): the finest level of granularity captures each individual user action (e.g., insertion, deletion, accept AI suggestion) during a writing session. Each session is stored as a separate JSON file named [session_id].json, located at static/dataset/[dataset_name]/json folder.
  • Event Blocks (Grouped Actions): To facilitate analysis, individual events are grouped into event blocks. By default, an event block contains all consecutive actions a user performs while actively writing, ending when the user either requests AI suggestions or accepts an AI insertion. Each session is stored as [session_id].json within static/dataset/[dataset_name]/segment_results folder.
  • Session Info (Session-Level Metadata): this file (static/dataset/[dataset_name]/session.json) contains high-level metadata (e.g., topic, writing ID, AI model) for writing sessions. This JSON files contains all the writing sessions from a specific dataset, with each JSON object corresponds to one writing session.

Detailed Data Specifications

Below is the structure and examples for the three levels.

Events (Individual User Actions)

Location: static/dataset/[dataset_name]/json/[session_id].json Each file is a writing session with the following structure

Schema:

type Event {
  // Type of action
  name: "suggestion-open" | "text-insert" | "text-delete" | string;       
  text?: string;         // Text content involved in the action (if applicable)
  eventSource: "user" | "api"; // Source of the action
  event_time: string;    // Timestamp, e.g., "YYYY-MM-DD hh:mm:ss"
  progress: number;      // Document-level progress (0–1)
  pos: number;           // Character position in the document
}

type SessionEvents {
  init_text: string[];   // Initial text representing the topic
  init_time: string[];   // Timestamp(s) when the initial text was presented
  text: string[];        // Full text content (after applying actions)
  events: Event[];       // List of all actions in the session
}

Example: /static/dataset/creative/json/016...84f.json

Event Blocks (Grouped Actions)

Location: static/dataset/[dataset_name]/segment_results/[session_id].json

Schema:

  type EventBlock = {
    start_progress: number; // Document progress at segment start (0–1)
    end_progress: number;   // Document progress at segment end (0–1)
    start_time: number;     // Start time in seconds since session start
    end_time: number;       // End time in seconds since session start
    actions: number[];      // List of action IDs in this block

    // Other user-defined attributes (e.g., scores, text length)
    [key: string]: number | string | boolean | number[] | string[] | null;
  };

  type EventBlocksFile = EventBlock[];

Additional, user-defined attributes (e.g., scores, text length) can be added as needed.

Example: /static/dataset/creative/segment_results/016...84f.json

Session Info (Session-Level Metadata)

Location: static/dataset/[dataset_name]/session.json

High-level metadata for all writing sessions. Each JSON object represents one complete session. Only session_id is required; all other fields are user-defined based on analysis needs.

Schema:

type SessionInfo = {
  session_id: string; // Required: unique session identifier

  // Optional / user-defined fields:
  writer_id?: string; // Unique writer identifier
  topic?: string;     // Writing prompt/topic

  // Other user-defined attributes
  [key: string]: string | number | boolean | null | undefined;
};

type SessionInfoFile = SessionInfo[];

Example: /static/dataset/creative/session.json

How to import your own dataset

Data Preprocessing: You can use static/backend/index.ipynb to preprocess the data, a Google olab version is available here. You need to use [dataset_name].zip instead of folder [dataset_name].

This script takes in two files (i) data/session.jsonl, which saves the complete writing action logs as the format specified in the CoAuther Dataset Schema, and (ii) data.csv, which specific the session level data at least session_id and prompt_code. Sample can be checked in static/import_dataset/creative.csv. The outputed folder [dataset_name] will contain all the files as described in Data Structure.

Loading Data into InkPulse:

  • Method One: Running InkPulse Locally. Fork this repo and run InkPulse locally following the Getting Started instructions. Place the folder generated from last step within static/dataset and register your dataset ([dataset_name]) at static/dataset/dataset_name.json. You can then start your visual exploration.
  • Method Two: Upload Directly to the Website. Direct upload support is currently under development. Stay tuned for updates!

Use the following code to convert your dataset into local database. NOTE: only folder that in static/dataset will be detected.

npx tsx scripts/import-groups.ts

Or, you can upload a .zip file on the website.