-
-
Notifications
You must be signed in to change notification settings - Fork 23
Description
My main goal after node optimizations is to support event tags ( ex. "protest ride" ); i started pondering.... and the idea grew a bit. 😊 I went through existing issues on db data, and put together some ideas for a table reorg:
New Tables:
all new tables use new names to avoid overwriting the old; splits calevent into several smaller tables, and simplifies caldaily. the summarize() query ( in the node optimization branch ) would handle merging all of these into a backwards compatible query ( for the events, ical, and search endpoints ); manage and delete are the main endpoints that would need updating. ( and unit tests. )
( potentially, for a little while, we could write to both sets of tables "just in case" ™️ -- tho it might make the code somewhat icky. )
LEDE:
this is the core bit of what was calevent. i think ledeis a nice name because it'd be unique in the code base. indicates it's a lead-in, the initial part of ride info but it can't stand fully on its own either.
- auto id: maybe call this
series_id? -- the ride and all its days binds to this id. - auto
createdtime - auto
modifiedtime: the node opts branch fixes modified data not working. i think. title: string, as it is now.summary: replacesdescrorganizer: replacesnamesecret: we can set this to null to block editing. ( when migrating, it becomes null ifcalevent.passwordis blank )published: int, this replaces both our 'change counter', andnot coalesce(hidden, 0).
when migrating data: the series_id will have to be set to the current calevent.id for every ride; that's the simplest route for backwards compatible links and images.
currently, the change counter is used both to generate image names for cache-busting and to notify ical readers about data changes. the table reorg would separate those usages.
published would be for content changes. let it start at 0, and increase to 1 when initially published; then count number of times an event has been saved to act like changes does. we could arrange things so that null revokes and hides the ride. ( when migrating, if the current password is null or blank, then published would be null; otherwise it'd be the opposite of hidden )
STATUS:
mimics caldaily, but no pkid; and simplifies status.
rationale: caldaily.pkid is easy to confuse with calevent.id; and, the date is enough to uniquely identify the status, so we don't even really need it! the primary key for status here would be a composite key: ymd + series_id.
- series_id: one or more status entries per ride
- ymd: date without time in YYYY-MM-DD format. the weird name to help remember that.
- news: string, replaces
newsflash - scheduled: replaces
eventstatus. instead of achar, uses 1, 0, or null. 1 is active, 0 is cancelled, null is delisted.
PRIVATE:
rationale, by putting all private data in its own table, we can easily export anonymized data by excluding this table(!) and, i think we could pretty easily create a PUBLIC table view based on its internal show settings: so it'd limit the chances of a coding error leaking personal details.
- series_id - exactly one private row for every ride
- private_email(string) -- long names, but really hammers it home.
- private_phone(string)
- private_contact(string)
- show_email: 0,1
- show_phone:
- show_contact
PRINT:
used for the print calendar. rationale, only a very small number of rides use this. partitioning off tiny here, helps show what it's used for.
- series_id: at most one print entry for every ride
- tiny_title: string; when migrating if its the same as the automunge, dont store the field.
- tiny_summary: string.
- add_email: 0,1. ex. whether to "add" the email to the print calendar.
- add_phone
- add_link
- add_contact
when migrating if the fields here would all be empty, we can skip adding a print entry.
WEB:
takes the place of 'weburl' and 'webname'
rationale: could potentially support multiple sites, could be bluesky handles , etc.
the primary key would be series_id + web_type.
- series_id: zero or more web entries for every ride
- web_type: ex. "url".
- web_text: human readable
- web_link: ex. "https://...."
IMAGE:
for any rides with images; rationale: not all rides have images;
- series_id: at most one image for every ride
- img_ext ( extension )
- img_alt ( alt text )
- img_num: int, increments monotonically every image update ( or if it was easy to determine; could be the image hash. )
- img_override: null, or a string for events that have images with specific filenames.
on disk, we store most images as calevent_id.ext, but report them as (calevent_id)-(image_num).ext for cache-busting. we pull the image num from calevent.changes, and we record a string with the format "calevent_id.ext" into the db. occasionally, there are images with specific names ( usually setup by one of the calendar team manually ) - which override that.
this change would give image its own counter, and would store it as an int, instead of a string. img_override would handle the rare case where we have given an image a special name.
LOCATION:
handles both the start, and optionally the end of the ride. the primary key would be loc_type + series_id. we could maybe have table views for START and FINISH
- series_id: one or more locations for every ride. ( ex. all rides have a start, and many have an end. )
- loc_type: "start" for start, "end" for end. would be a cool expansion to allow
YYYY-MM-DDvalues here to override the location for particular days. ( ex. a special time and location on a given day, maybe for shift hack meetups ) - loc_name
- loc_address
- loc_instructions: ex. replaces
locdetails - loc_hrmin: ex. replaces
eventtime. using this weird name b/c i sometimes confused what we expect it to store.ride_durationwould be calculated fromfinish.hrmin-start.hrmin. storingride_durationwould update thefinish.hrmin( as part of the manage endpoint ) - loc_hrmin_info: ex. replaces
timedetails
the proper db way probably wouldn't store loc_type and series. instead, it would give each location a unique id, and have the lede table store "location ids" for start and end. BUT that implies we could share locations, and i don't think we should. ( location here has more than just a geo-code. ) AND, i really like the idea that every table has the series id. i think that makes manually inspecting the db super easy.
TAG:
new! this is the main goal. the primary key would be tag_name + series_id
- series_id: zero or more tags for every ride.
- tag_name: string.
- tag_value: string, or the string "true". ( could use blank or null for a valueless tag, but that might be confusing. )
the tags could maybe some day be added directly by users, these could be some of the built-in tags:
- audience: family, ,adult -- i think "general" could be the absence of an audience tag
- pace ex. easy,steady,strenuous, etc.
no drop? - area: ex.vancouver, westside, eastside, clackmas. assumes "portland" if there isn't an area tag.
- safety: the string "true", or doesn't exist and isn't covid friendly.
- featured: the string "true", some extra details, or doesn't exist and isn't featured.
- loop: the string "true", or doesn't exist and isn't a loop.
- distance: one of the exiting strings: "0-3", ..., "15+"
- protest: "true", or doesn't exist and isn't a protest ride.
- ... etc.
LEGACY:
exists to respond to existing calendar/event-PKID urls that people might have bookmarked. the node code would catch those urls, find the new table data by matching the incoming `pkid, and join in the lede and status tables to return the right data.
series_id: points to the new migrated data. one legacy entry would exist for every existing caldaily; no new entries would be created after migration.pkid: from caldailyeventdate: from caldaily- eventstatus and newsflash: these would live in the status table, and don't need to be duplicated here.
in an ideal world, instead of using urls like -- ex. https://www.shift2bikes.org/calendar/event-21504 -- we could immediately start using urls likecalendar/event/SERIES_ID/YY-MM-DD, and update all current code to use that. any old saved bookmarks would route through the legacy table to interpret those 5 digit pkid numbers. anyone getting new links would use the new format and directly route to current data. we wouldn't need a pkid for new events.
BUT we have no idea how bike fun would react to a new url scheme, so we should keep event-NUMBER for now. AND, to be super super safe, we probably should assume bikefun is only good up till a signed int32 number. That's 10 digits, but limited to a max of 2,147,483,647. Figuring out how to support that without a pkid style table is a bit odd....
The simple case: Any request < 6 digits, we assume is an old bookmark, and we run through the legacy table, matching the pkid to find the right data.
The complicated case: all new urls that the backend generates would find the number of days since Jan 01 2008 ( the first day in the db is 2008-02-20 ) -- which is about 6507 days right now. And, if we used the "left" 5 numbers of the event-NUMBER as days, that would get us to Oct 17, 2066 ( 21474 days ) before overflowing.
For the 5 digits on the "right" side we would use the series id. We're currently at 10,024 total events, and we could get up to 83,647 events without worry -- someone smarter than me can guess how the two sides would interact above that; but we can keep them safely independent for a nice long time.
At any rate, the new node code could generate urls of the format event-(5 digits of days)(5 digits of series) -- and we could reference the day of any particular ride while still keeping bikefun ( and any possible integer limitations ) happy. For 40 years or so, at any rate.