Skip to content

Feature/hansard 2026#2077

Open
Meesch wants to merge 8 commits into
developfrom
feature/hansard-2026
Open

Feature/hansard 2026#2077
Meesch wants to merge 8 commits into
developfrom
feature/hansard-2026

Conversation

@Meesch
Copy link
Copy Markdown
Contributor

@Meesch Meesch commented May 27, 2026

Nieuwe UK Data

@Meesch Meesch requested a review from JeltevanBoheemen May 27, 2026 08:17
Copy link
Copy Markdown
Contributor

@JeltevanBoheemen JeltevanBoheemen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well done, seems like quite smoe work to figure out the data cleaning.

Remark on the corpus/file name: using new to indicate a version is not future proof. What happens when we get another update? We will end up with things like Index_Current_New and Index_Current_Old (actual example from another project). Maybe call it uk_hansard?

Some nitpicky code quality comments, feel free to ignore if you consider them too small to fix :)




def extract_date(path: str):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def extract_date(path: str):
def extract_date(path: str) -> Optional[str]:

Type annotations are not consistent in this file. Not a big deal, but if you're doing them anyway, might as well make them complete.

Annotating all parameters types has higher prio then return types imho.

Comment on lines +32 to +37
if 'daylord' in path:
return 'House of Lords'
elif 'debates' in path:
return 'House of Commons'
else:
return None
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nitpicky, but noticed this in more functions in this file. else clause and return None is reduntant, though I would argue explicitly returning None is a ncie thing to do.

Suggested change
if 'daylord' in path:
return 'House of Lords'
elif 'debates' in path:
return 'House of Commons'
else:
return None
if 'daylord' in path:
return 'House of Lords'
elif 'debates' in path:
return 'House of Commons'
return None

speech_id = abbreviate_speech_id(full_speech_id)
previous_topic = ''
for key in topic_dict:
if float(key) > float(speech_id):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Smart way to compare 10.6 to 10.7 :)

def lookup_person_attribute(lookup_tuple):
metadata_dict, id, name, label = lookup_tuple #name is only included for debugging purposes

id = id.split('/')[-1] if id else None # twfy ID is at the end of uri
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is twfy?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

they work for you

Comment on lines +2 to +13
<p-multiSelect

[id]="filter.corpusField.name"
[disabled]="!options.length"
[filter]="options.length>=6"
[options]="options"
[virtualScroll]="true"
[virtualScrollItemSize]="60"
[maxSelectedLabels]=1
placeholder="Choose"
[ngModel]="data"
(onChange)="update($event.value)" ariaLabelledBy="legend-filter-{{filter.displayName | slugify}}" fluid>
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to work well (tested on acc), but is a separate feature.
The filters are well tested. Should the virtualscroll be tested there? (not sure, may be fine like this).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants