polling_day = date(2026, 5, 8)
polling_day_text = "2026-05-08"
I've hard-coded the date as 8 May 2026 (i.e. one day after the "real" elections). We could read the date in as a parameter if you wanted.
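On reading the date in as a parameter: a minimal sketch, assuming the command follows the usual Django pattern where `add_arguments` receives an `argparse` parser. The flag name `--polling-day` and the helper `add_polling_day_argument` are hypothetical, and plain `argparse` stands in for the management command so the snippet runs on its own.

```python
import argparse
from datetime import date


def add_polling_day_argument(parser: argparse.ArgumentParser) -> None:
    """Register an optional --polling-day flag (hypothetical name).

    Falls back to the currently hard-coded date when the flag is omitted.
    """
    parser.add_argument(
        "--polling-day",
        type=date.fromisoformat,  # parses "YYYY-MM-DD" into a datetime.date
        default=date(2026, 5, 8),
        help="Polling day for the training election (YYYY-MM-DD)",
    )


parser = argparse.ArgumentParser()
add_polling_day_argument(parser)
options = parser.parse_args(["--polling-day", "2026-05-15"])

# Derive the text form from the date so the two can never drift apart.
polling_day = options.polling_day
polling_day_text = polling_day.isoformat()
```

Deriving `polling_day_text` from `polling_day` would also remove the need to keep the two hard-coded values in sync.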
# This person is standing in this ward, but we've put them down as
# standing for Labour and Co-operative Party (joint-party:53-119)
# whereas they are actually standing for Labour Party (PP53).
# Easy mistake, but we will need to fix it.
self.create_membership(
    person_id=91518,
    party=labour_and_coop_party,
    ballot_id=f"local.havering.st-andrews.{polling_day_text}",
    org=org,
)
Looking back at the real 2022 ballots, we didn't actually handle this Labour/Labour & Co-op issue correctly in all cases for the 2022 election.
Is that a function of the feature set of YNR at the time and something we do better now, or is this not actually a useful edge case to train on?
I think this is a mistake in our data from back then. I wonder how we still have this problem, given the work Will DM did on importing results.
(It's a useful thing for us to catch, but FWIW, the BBC will squash the parties together in the end product)
OK. Do you think it would be more useful to replace this with a case where the person already in the DB is listed under a totally different party from the one on the SOPN, to add that case as well, or to just stick with this?
(f"local.havering.heaton.{polling_day_text}", 91359),
(f"local.havering.heaton.{polling_day_text}", 91363),
(f"local.havering.heaton.{polling_day_text}", 43250),
(f"local.havering.mawneys.{polling_day_text}", 42931),
(f"local.havering.mawneys.{polling_day_text}", 42935),
(f"local.havering.mawneys.{polling_day_text}", 42937),
(f"local.havering.gooshays.{polling_day_text}", 91387),
(f"local.havering.gooshays.{polling_day_text}", 42361),
(f"local.havering.gooshays.{polling_day_text}", 91389),
Person IDs are pretty load-bearing, so I'm just hard-coding magic numbers here. Do you know off the top of your head what would happen if we merged one of these into another person record in future? Will that just work, or do we need to update this script if that happens?
No, it won't just work. I guess at this stage we can just raise a useful exception, but if you wanted you could catch that and look up the old ID from the PersonRedirect model.
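A rough sketch of that fallback, under loud assumptions: a plain set stands in for the Person table and a plain dict of old ID to new ID stands in for PersonRedirect rows, since the model's actual field names aren't shown in this thread. `PersonNotFound` and `resolve_person_id` are hypothetical names.

```python
class PersonNotFound(Exception):
    """Raised when a hard-coded person ID cannot be resolved."""


def resolve_person_id(person_id, existing_ids, redirects):
    """Return a usable person ID, following merge redirects if needed.

    existing_ids: set of IDs currently present (stand-in for the Person table).
    redirects: dict of old ID -> new ID (stand-in for PersonRedirect rows).
    """
    seen = set()
    current = person_id
    while current not in existing_ids:
        # Stop on a missing redirect or a redirect loop, with a useful message.
        if current in seen or current not in redirects:
            raise PersonNotFound(
                f"Person {person_id} does not exist and has no usable redirect; "
                "has the record been merged or deleted since this script was written?"
            )
        seen.add(current)
        current = redirects[current]
    return current
```

Following the chain in a loop covers the case where a person has been merged more than once, and the `seen` set guards against a circular redirect.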
symroe left a comment
I think this is ok, and I assume you've tested it.
I wonder if you've, e.g., added some election results and then tried to delete the objects? More generally, I think we might have some issues where we can't cascade the deletes we need.
I also get that this isn't meant to be run from an empty DB, but it might be useful to document the data that needs to exist. For example, the org is fetched with a .get, so it will fail if we don't have the right DB dump imported. This isn't a major problem, but it might be handy in the future to know what the prerequisites are.
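One way the prerequisites could document themselves is a preflight check that reports everything missing in one go, rather than failing on the first `.get`. This is only a sketch: `MissingPrerequisites` and `check_prerequisites` are hypothetical names, and the checks are plain callables so the snippet runs without a database.

```python
class MissingPrerequisites(Exception):
    """Raised when the base data the command relies on is absent."""


def check_prerequisites(checks):
    """Run every named check and report all failures at once.

    checks: dict mapping a human-readable description to a zero-argument
    callable that returns True when the required data is present.
    """
    missing = [description for description, exists in checks.items() if not exists()]
    if missing:
        raise MissingPrerequisites(
            "Import a database dump first. Missing: " + "; ".join(missing)
        )
```

In the real command, each callable would presumably wrap a queryset `.exists()` call for the object in question, so the dict of descriptions doubles as the list of prerequisites.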
Added in b05d5ae
Following on from our conversation about this, I did some testing and realised that just calling So I think the teardown code is actually good as it stands. I reckon this PR is probably good to go.
This PR adds a management command which sets up some test data we can use for training.
I've based it off this SOPN
local.havering.2022-05-05.pdf
because it contains a number of useful features:
I think we can use the following 5 wards for training, and that would cover some large wards and the following names:
I've set up Harold Wood with 0 suggested candidates.
Heaton, Mawneys and Gooshays have all got some suggested candidates on them, and they all match the SOPN.
St Andrew's has suggested candidates on it which cover various classes of mistakes/errors.
The one case this SOPN doesn't cover is a withdrawn candidacy. One option is to add test data for another election with a real SOPN that has a withdrawal on it. Another option would be to produce a version of this PDF, doctored to include a withdrawal or two in our test wards, for use in the training. That would keep things pretty neat. I think the second option might be better, but let me know what you think.
The way I've done this management command is:
You can run it, and it will remove the test election plus any related ballot/membership objects. This means we can dry run the training session and then "factory reset" the data, or run the session multiple times re-setting the data between each run.
However, this script doesn't create everything from scratch, and it will not run against a completely empty DB. I have tried to minimise this by making relations null where possible, but it does expect certain DB objects to already exist, so we need to import a real dump and then run this on top of it. It should be relatively agnostic about when the dump was taken, though: if we take another export in a month's time, this should still run on top of that base.
Specifically, this script assumes the existence of:
- Organization object for local-authority:havering
- Post objects for the 2022-onwards wards of Havering
- Party objects for:
- Person objects

There are probably some places I could DRY up or whatever, but I'm treating this more like throwaway code we will run once than code we plan to maintain for a long time.