Merged
37 changes: 19 additions & 18 deletions ENV.md
@@ -32,24 +32,25 @@ Task flags are used to enable/disable certain tasks. They are set to `1` to enab

The following flags are available:

| Flag | Description |
|---------------------------------------|-------------------------------------------------------|
| `SCHEDULED_TASKS_FLAG` | All scheduled tasks. |
| `URL_HTML_TASK_FLAG` | URL HTML scraping task. |
| `URL_RECORD_TYPE_TASK_FLAG` | Automatically assigns Record Types to URLs. |
| `URL_AGENCY_IDENTIFICATION_TASK_FLAG` | Automatically assigns and suggests Agencies for URLs. |
| `URL_SUBMIT_APPROVED_TASK_FLAG` | Submits approved URLs to the Data Sources App. |
| `URL_MISC_METADATA_TASK_FLAG` | Adds misc metadata to URLs. |
| `URL_404_PROBE_TASK_FLAG` | Probes URLs for 404 errors. |
| `URL_AUTO_RELEVANCE_TASK_FLAG` | Automatically assigns Relevances to URLs. |
| `URL_PROBE_TASK_FLAG` | Probes URLs for web metadata. |
| `URL_ROOT_URL_TASK_FLAG` | Extracts and links Root URLs to URLs. |
| `SYNC_AGENCIES_TASK_FLAG` | Synchronizes agencies from the Data Sources App. |
| `SYNC_DATA_SOURCES_TASK_FLAG` | Synchronizes data sources from the Data Sources App. |
| `PUSH_TO_HUGGING_FACE_TASK_FLAG` | Pushes data to HuggingFace. |
| `POPULATE_BACKLOG_SNAPSHOT_TASK_FLAG` | Populates the backlog snapshot. |
| `DELETE_OLD_LOGS_TASK_FLAG` | Deletes old logs. |
| `RUN_URL_TASKS_TASK_FLAG` | Runs URL tasks. |
| Flag | Description |
|---------------------------------------|--------------------------------------------------------|
| `SCHEDULED_TASKS_FLAG` | All scheduled tasks. |
| `URL_HTML_TASK_FLAG` | URL HTML scraping task. |
| `URL_RECORD_TYPE_TASK_FLAG` | Automatically assigns Record Types to URLs. |
| `URL_AGENCY_IDENTIFICATION_TASK_FLAG` | Automatically assigns and suggests Agencies for URLs. |
| `URL_SUBMIT_APPROVED_TASK_FLAG` | Submits approved URLs to the Data Sources App. |
| `URL_MISC_METADATA_TASK_FLAG` | Adds misc metadata to URLs. |
| `URL_404_PROBE_TASK_FLAG` | Probes URLs for 404 errors. |
| `URL_AUTO_RELEVANCE_TASK_FLAG` | Automatically assigns Relevances to URLs. |
| `URL_PROBE_TASK_FLAG` | Probes URLs for web metadata. |
| `URL_ROOT_URL_TASK_FLAG` | Extracts and links Root URLs to URLs. |
| `SYNC_AGENCIES_TASK_FLAG` | Synchronizes agencies from the Data Sources App. |
| `SYNC_DATA_SOURCES_TASK_FLAG` | Synchronizes data sources from the Data Sources App. |
| `PUSH_TO_HUGGING_FACE_TASK_FLAG` | Pushes data to HuggingFace. |
| `POPULATE_BACKLOG_SNAPSHOT_TASK_FLAG` | Populates the backlog snapshot. |
| `DELETE_OLD_LOGS_TASK_FLAG` | Deletes old logs. |
| `RUN_URL_TASKS_TASK_FLAG` | Runs URL tasks. |
| `IA_PROBE_TASK_FLAG` | Extracts and links Internet Archives metadata to URLs. |
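
A minimal sketch of how a `1`/`0` task flag like the ones above could be read at startup. The helper name `is_task_enabled` and the choice to treat an unset flag as enabled are assumptions made for illustration; the project's own flag handling may differ.

```python
import os


def is_task_enabled(flag_name: str, default: bool = True) -> bool:
    """Interpret a task flag environment variable: "1" enables, "0" disables.

    Hypothetical helper; the default for unset flags is an assumption.
    """
    raw = os.environ.get(flag_name)
    if raw is None:
        return default
    value = raw.strip()
    if value not in ("0", "1"):
        raise ValueError(f"{flag_name} must be '0' or '1', got {raw!r}")
    return value == "1"


# Example: disable the Internet Archives probe task for this process.
os.environ["IA_PROBE_TASK_FLAG"] = "0"
ia_probe_enabled = is_task_enabled("IA_PROBE_TASK_FLAG")
```

Rejecting values other than `0` and `1` is a design choice in this sketch: it surfaces typos in the environment early instead of silently enabling or disabling a task.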


## Foreign Data Wrapper (FDW)
108 changes: 108 additions & 0 deletions alembic/versions/2025_08_14_0722-2a7192657354_add_internet_archive_tables.py
@@ -0,0 +1,108 @@
"""Add Internet Archive Tables

Revision ID: 2a7192657354
Revises: 49fd9f295b8d
Create Date: 2025-08-14 07:22:15.308210

"""
from typing import Sequence, Union

from alembic import op
import sqlalchemy as sa

from src.util.alembic_helpers import url_id_column, created_at_column, id_column, updated_at_column, switch_enum_type

# revision identifiers, used by Alembic.
revision: str = '2a7192657354'
down_revision: Union[str, None] = '49fd9f295b8d'
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None

IA_METADATA_TABLE_NAME = "urls_internet_archive_metadata"
IA_FLAGS_TABLE_NAME = "flag_url_checked_for_internet_archive"

def upgrade() -> None:
    # flake8 D103: Missing docstring in public function
    _create_metadata_table()
    _create_flags_table()
    _add_internet_archives_task_enum()

def downgrade() -> None:
    # flake8 D103: Missing docstring in public function
    op.drop_table(IA_METADATA_TABLE_NAME)
    op.drop_table(IA_FLAGS_TABLE_NAME)
    _remove_internet_archives_task_enum()


def _create_metadata_table():
    op.create_table(
        IA_METADATA_TABLE_NAME,
        id_column(),
        url_id_column(),
        sa.Column('archive_url', sa.String(), nullable=False),
        sa.Column('digest', sa.String(), nullable=False),
        sa.Column('length', sa.Integer(), nullable=False),
        created_at_column(),
        updated_at_column(),
        sa.UniqueConstraint('url_id', name='uq_url_id_internet_archive_metadata')
    )

def _add_internet_archives_task_enum():
    switch_enum_type(
        table_name='tasks',
        column_name='task_type',
        enum_name='task_type',
        new_enum_values=[
            'HTML',
            'Relevancy',
            'Record Type',
            'Agency Identification',
            'Misc Metadata',
            'Submit Approved URLs',
            'Duplicate Detection',
            '404 Probe',
            'Sync Agencies',
            'Sync Data Sources',
            'Push to Hugging Face',
            'URL Probe',
            'Populate Backlog Snapshot',
            'Delete Old Logs',
            'Run URL Task Cycles',
            'Root URL',
            'Internet Archives Probe',
            'Internet Archives Archive'
        ]
    )

def _remove_internet_archives_task_enum():
    switch_enum_type(
        table_name='tasks',
        column_name='task_type',
        enum_name='task_type',
        new_enum_values=[
            'HTML',
            'Relevancy',
            'Record Type',
            'Agency Identification',
            'Misc Metadata',
            'Submit Approved URLs',
            'Duplicate Detection',
            '404 Probe',
            'Sync Agencies',
            'Sync Data Sources',
            'Push to Hugging Face',
            'URL Probe',
            'Populate Backlog Snapshot',
            'Delete Old Logs',
            'Run URL Task Cycles',
            'Root URL',
        ]
    )

def _create_flags_table():
    op.create_table(
        IA_FLAGS_TABLE_NAME,
        url_id_column(),
        sa.Column('success', sa.Boolean(), nullable=False),
        created_at_column(),
        sa.PrimaryKeyConstraint('url_id')
    )
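
As a rough illustration of what the two constraints in this migration enforce (at most one metadata row and one "checked" flag per URL), the sketch below rebuilds simplified versions of both tables in an in-memory SQLite database using only the standard library. The column types, the omission of the `id`, `created_at`, and `updated_at` helper columns, the sample rows, and the use of SQLite in place of the project's real database are all assumptions made for the example; it is not the migration itself.

```python
import sqlite3

# Simplified stand-ins for the two tables created by the migration.
# Assumption: helper columns are omitted and SQLite types are used.
SCHEMA = """
CREATE TABLE urls_internet_archive_metadata (
    url_id INTEGER NOT NULL,
    archive_url TEXT NOT NULL,
    digest TEXT NOT NULL,
    length INTEGER NOT NULL,
    CONSTRAINT uq_url_id_internet_archive_metadata UNIQUE (url_id)
);
CREATE TABLE flag_url_checked_for_internet_archive (
    url_id INTEGER NOT NULL,
    success BOOLEAN NOT NULL,
    PRIMARY KEY (url_id)
);
"""

META_SQL = "INSERT INTO urls_internet_archive_metadata VALUES (?, ?, ?, ?)"
FLAG_SQL = "INSERT INTO flag_url_checked_for_internet_archive VALUES (?, ?)"


def try_insert(conn: sqlite3.Connection, sql: str, params: tuple) -> bool:
    """Return True if the row was inserted, False if a constraint rejected it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Made-up sample rows: the second insert for the same url_id is rejected.
first_meta = try_insert(conn, META_SQL, (1, "https://example.org/a", "abc", 10))
second_meta = try_insert(conn, META_SQL, (1, "https://example.org/b", "def", 20))
first_flag = try_insert(conn, FLAG_SQL, (1, True))
second_flag = try_insert(conn, FLAG_SQL, (1, False))
```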

# flake8 W391: blank line at end of file
1 change: 1 addition & 0 deletions pyproject.toml
@@ -4,6 +4,7 @@ version = "0.1.0"
requires-python = ">=3.11"
dependencies = [
"aiohttp~=3.11.11",
"aiolimiter>=1.2.1",
"alembic~=1.14.0",
"apscheduler~=3.11.0",
"asyncpg~=0.30.0",
6 changes: 5 additions & 1 deletion src/api/main.py
@@ -32,6 +32,7 @@
from src.db.client.sync import DatabaseClient
from src.external.huggingface.hub.client import HuggingFaceHubClient
from src.external.huggingface.inference.client import HuggingFaceInferenceClient
from src.external.internet_archives.client import InternetArchivesClient
from src.external.pdap.client import PDAPClient
from src.external.url_request.core import URLRequestInterface

@@ -81,7 +82,7 @@ async def lifespan(app: FastAPI):
hf_inference_client=HuggingFaceInferenceClient(
session=session,
token=env_var_manager.hf_inference_api_key
)
),
),
)
async_collector_manager = AsyncCollectorManager(
@@ -104,6 +105,9 @@ async def lifespan(app: FastAPI):
token=env_var_manager.hf_hub_token
),
async_core=async_core,
ia_client=InternetArchivesClient(
session=session
)
),
registry=ScheduledJobRegistry()
)
6 changes: 1 addition & 5 deletions src/core/exceptions.py
@@ -3,10 +3,6 @@
from fastapi import HTTPException


class InvalidPreprocessorError(Exception):
    pass


class MuckrockAPIError(Exception):
    pass

@@ -17,4 +13,4 @@ class MatchAgencyError(Exception):

class FailedValidationException(HTTPException):
    def __init__(self, detail: str):
        super().__init__(status_code=HTTPStatus.BAD_REQUEST, detail=detail)
        super().__init__(status_code=HTTPStatus.BAD_REQUEST, detail=detail)
19 changes: 15 additions & 4 deletions src/core/tasks/base/operator.py
@@ -1,6 +1,7 @@
import traceback
from abc import ABC, abstractmethod

from src.core.enums import BatchStatus
from src.core.tasks.base.run_info import TaskOperatorRunInfo
from src.core.tasks.url.enums import TaskOperatorOutcome
from src.db.client.async_ import AsyncDatabaseClient
@@ -9,8 +10,18 @@

class TaskOperatorBase(ABC):
    def __init__(self, adb_client: AsyncDatabaseClient):
        self.adb_client = adb_client
        self.task_id = None
        self._adb_client = adb_client
        self._task_id: int | None = None

    @property
    def task_id(self) -> int:
        # flake8 D102: Missing docstring in public method
        if self._task_id is None:
            raise AttributeError("Task id is not set. Call initiate_task_in_db() first.")
        return self._task_id

    @property
    def adb_client(self) -> AsyncDatabaseClient:
        # flake8 D102: Missing docstring in public method
        return self._adb_client

    @property
    @abstractmethod
@@ -27,8 +38,8 @@
    async def conclude_task(self):
        raise NotImplementedError

    async def run_task(self, task_id: int) -> TaskOperatorRunInfo:
        self.task_id = task_id
    async def run_task(self) -> TaskOperatorRunInfo:
        # flake8 D102: Missing docstring in public method
        self._task_id = await self.initiate_task_in_db()
        try:
            await self.inner_task_logic()
            return await self.conclude_task()
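
The pattern in this hunk, where `task_id` is only readable once `run_task()` has created the task record, can be sketched in isolation as follows. `DemoOperator`, `read_task_id_or_none`, and the hard-coded id `42` are hypothetical stand-ins; the real `initiate_task_in_db()` is not shown in this diff and presumably writes to the database through `adb_client`.

```python
import asyncio
from typing import Optional


class DemoOperator:
    """Hypothetical stand-in for TaskOperatorBase, reduced to the task_id guard."""

    def __init__(self) -> None:
        self._task_id: Optional[int] = None

    @property
    def task_id(self) -> int:
        # Reading the id before the task exists is treated as a programming error.
        if self._task_id is None:
            raise AttributeError("Task id is not set. Call initiate_task_in_db() first.")
        return self._task_id

    async def initiate_task_in_db(self) -> int:
        # Assumption: the real method inserts a task row and returns its id.
        return 42

    async def run_task(self) -> int:
        # The operator obtains its own id instead of receiving one as an argument.
        self._task_id = await self.initiate_task_in_db()
        return self.task_id


def read_task_id_or_none(operator: DemoOperator) -> Optional[int]:
    """Return the task id, or None if the task has not been initiated yet."""
    try:
        return operator.task_id
    except AttributeError:
        return None


operator = DemoOperator()
id_before_run = read_task_id_or_none(operator)
id_after_run = asyncio.run(operator.run_task())
```

One side effect worth knowing: because the property raises `AttributeError`, `hasattr(operator, "task_id")` returns `False` until the task has been initiated.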
10 changes: 0 additions & 10 deletions src/core/tasks/dtos/run_info.py

This file was deleted.

1 change: 0 additions & 1 deletion src/core/tasks/handler.py
@@ -4,7 +4,6 @@

from src.core.enums import BatchStatus
from src.core.tasks.base.run_info import TaskOperatorRunInfo
from src.core.tasks.dtos.run_info import URLTaskOperatorRunInfo
from src.core.tasks.url.enums import TaskOperatorOutcome
from src.db.client.async_ import AsyncDatabaseClient
from src.db.enums import TaskType
File renamed without changes.
43 changes: 43 additions & 0 deletions src/core/tasks/mixins/link_urls.py
@@ -0,0 +1,43 @@
from abc import abstractmethod
# flake8 D100: Missing docstring in public module

from src.db.client.async_ import AsyncDatabaseClient


class LinkURLsMixin:
    # flake8 D101: Missing docstring in public class

    def __init__(
        # flake8 D107: Missing docstring in __init__
        self,
        *args,
        **kwargs
    ):
        super().__init__(*args, **kwargs)
        self._urls_linked = False
        self._linked_url_ids = []

    @property
    def urls_linked(self) -> bool:
        # flake8 D102: Missing docstring in public method
        return self._urls_linked

    @property
    def linked_url_ids(self) -> list[int]:
        # flake8 D102: Missing docstring in public method
        return self._linked_url_ids

    @property
    @abstractmethod
    def adb_client(self) -> AsyncDatabaseClient:
        # flake8 D102: Missing docstring in public method
        raise NotImplementedError

    @property
    @abstractmethod
    def task_id(self) -> int:
        # flake8 D102: Missing docstring in public method
        raise NotImplementedError

    async def link_urls_to_task(self, url_ids: list[int]):
        # flake8 D102: Missing docstring in public method
        self._linked_url_ids = url_ids
        if not hasattr(self, "linked_url_ids"):
            raise AttributeError("Class does not have linked_url_ids attribute")
        await self.adb_client.link_urls_to_task(
            task_id=self.task_id,
            url_ids=url_ids
        )
        self._urls_linked = True
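
How the mixin is meant to compose with an operator can be sketched without a database. In the example below, `FakeAsyncClient`, `BaseOperator`, and `DemoLinkingOperator` are hypothetical stand-ins, the task id `7` is made up, and the mixin is re-declared in reduced form (without the placeholder `adb_client` and `task_id` properties) so the snippet runs on its own. The sketch also assumes the base class forwards to `super().__init__()`, which is what lets the mixin's `__init__` run when the base class comes first in the inheritance list.

```python
import asyncio


class FakeAsyncClient:
    """Hypothetical in-memory replacement for AsyncDatabaseClient."""

    def __init__(self) -> None:
        self.links: list[tuple[int, int]] = []

    async def link_urls_to_task(self, task_id: int, url_ids: list[int]) -> None:
        # Record (task_id, url_id) pairs instead of writing to a database.
        self.links.extend((task_id, url_id) for url_id in url_ids)


class LinkURLsMixin:
    """Reduced copy of the mixin from the diff, kept only for this sketch."""

    def __init__(self, *args, **kwargs) -> None:
        super().__init__(*args, **kwargs)
        self._urls_linked = False
        self._linked_url_ids: list[int] = []

    @property
    def urls_linked(self) -> bool:
        return self._urls_linked

    @property
    def linked_url_ids(self) -> list[int]:
        return self._linked_url_ids

    async def link_urls_to_task(self, url_ids: list[int]) -> None:
        self._linked_url_ids = url_ids
        await self.adb_client.link_urls_to_task(task_id=self.task_id, url_ids=url_ids)
        self._urls_linked = True


class BaseOperator:
    """Hypothetical base supplying the adb_client and task_id the mixin relies on."""

    def __init__(self, adb_client: FakeAsyncClient) -> None:
        # Assumption: forwarding here is what triggers LinkURLsMixin.__init__.
        super().__init__()
        self.adb_client = adb_client
        self.task_id = 7  # made-up id for the example


class DemoLinkingOperator(BaseOperator, LinkURLsMixin):
    pass


demo_client = FakeAsyncClient()
demo_operator = DemoLinkingOperator(demo_client)
linked_before = demo_operator.urls_linked
asyncio.run(demo_operator.link_urls_to_task([1, 2, 3]))
```

If a base class listed before the mixin does not call `super().__init__()`, Python never runs the mixin's `__init__`, and `urls_linked` would raise `AttributeError` until `link_urls_to_task()` has completed. That is general Python MRO behaviour, and may be worth checking against the operator classes that use this mixin.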

# flake8 W292: no newline at end of file
15 changes: 15 additions & 0 deletions src/core/tasks/mixins/prereq.py
@@ -0,0 +1,15 @@
from abc import ABC, abstractmethod
# flake8 D100: Missing docstring in public module


class HasPrerequisitesMixin(ABC):
    # flake8 D101: Missing docstring in public class

    def __init__(self, *args, **kwargs):
        # flake8 D107: Missing docstring in __init__
        super().__init__(*args, **kwargs)

    @abstractmethod
    async def meets_task_prerequisites(self) -> bool:
        """
        A task should not be initiated unless certain
        conditions are met
        """
        raise NotImplementedError
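
One way a caller might use `meets_task_prerequisites()` is to gate task execution on it. The sketch below is an assumption about usage, not code from this PR: `QueueBackedTask`, `run_if_ready`, and the "pending work exists" condition are invented for illustration, and the mixin is re-declared in reduced form so the snippet is self-contained.

```python
import asyncio
from abc import ABC, abstractmethod


class HasPrerequisitesMixin(ABC):
    """Reduced copy of the mixin from the diff."""

    @abstractmethod
    async def meets_task_prerequisites(self) -> bool:
        raise NotImplementedError


class QueueBackedTask(HasPrerequisitesMixin):
    """Hypothetical task that should only run while work is pending."""

    def __init__(self, pending: list[int]) -> None:
        self.pending = pending
        self.processed: list[int] = []

    async def meets_task_prerequisites(self) -> bool:
        return len(self.pending) > 0

    async def run(self) -> None:
        # Move everything from the pending queue to the processed list.
        self.processed.extend(self.pending)
        self.pending.clear()


async def run_if_ready(task: QueueBackedTask) -> bool:
    """Run the task only when its prerequisites hold; report whether it ran."""
    if not await task.meets_task_prerequisites():
        return False
    await task.run()
    return True


ran_with_work = asyncio.run(run_if_ready(QueueBackedTask([1, 2])))
ran_without_work = asyncio.run(run_if_ready(QueueBackedTask([])))
```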

# flake8 W292: no newline at end of file
5 changes: 3 additions & 2 deletions src/core/tasks/scheduled/enums.py
@@ -2,5 +2,6 @@


class IntervalEnum(Enum):
    DAILY = "DAILY"
    HOURLY = "HOURLY"
    DAILY = 60 * 24
    HOURLY = 60
    TEN_MINUTES = 10
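
The new integer values read naturally as minutes (`60 * 24` for a day), although the diff itself does not state the unit. Under that assumption, converting an interval into a `timedelta` for a scheduler could look like the sketch below; `interval_to_timedelta` is a hypothetical helper, not part of the PR.

```python
from datetime import timedelta
from enum import Enum


class IntervalEnum(Enum):
    # Values as introduced by the diff; interpreted here as minutes (assumption).
    DAILY = 60 * 24
    HOURLY = 60
    TEN_MINUTES = 10


def interval_to_timedelta(interval: IntervalEnum) -> timedelta:
    """Convert an interval to a timedelta, assuming the value is in minutes."""
    return timedelta(minutes=interval.value)


daily_delta = interval_to_timedelta(IntervalEnum.DAILY)
ten_minute_delta = interval_to_timedelta(IntervalEnum.TEN_MINUTES)
```

Storing numbers rather than strings means a scheduler can use the enum value directly, without a separate lookup from name to duration.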

# flake8 W292: no newline at end of file
Empty file.
Empty file.
31 changes: 31 additions & 0 deletions src/core/tasks/scheduled/impl/internet_archives/archive/operator.py
@@ -0,0 +1,31 @@
from src.core.tasks.mixins.link_urls import LinkURLsMixin
# flake8 D100: Missing docstring in public module
from src.core.tasks.mixins.prereq import HasPrerequisitesMixin
from src.core.tasks.scheduled.templates.operator import ScheduledTaskOperatorBase
from src.db.client.async_ import AsyncDatabaseClient
from src.db.enums import TaskType
from src.external.internet_archives.client import InternetArchivesClient


class InternetArchivesArchiveTaskOperator(
    # flake8 D101: Missing docstring in public class
    ScheduledTaskOperatorBase,
    HasPrerequisitesMixin,
    LinkURLsMixin
):

    def __init__(
        # flake8 D107: Missing docstring in __init__
        self,
        adb_client: AsyncDatabaseClient,
        ia_client: InternetArchivesClient
    ):
        super().__init__(adb_client)
        self.ia_client = ia_client

    async def meets_task_prerequisites(self) -> bool:
        # flake8 D102: Missing docstring in public method
        raise NotImplementedError

    @property
    def task_type(self) -> TaskType:
        # flake8 D102: Missing docstring in public method
        return TaskType.IA_ARCHIVE

    async def inner_task_logic(self) -> None:
        # flake8 D102: Missing docstring in public method
        raise NotImplementedError
Empty file.
17 changes: 17 additions & 0 deletions src/core/tasks/scheduled/impl/internet_archives/probe/convert.py
@@ -0,0 +1,17 @@
from src.external.internet_archives.models.ia_url_mapping import InternetArchivesURLMapping
# flake8 D100: Missing docstring in public module
from src.db.models.impl.flag.checked_for_ia.pydantic import FlagURLCheckedForInternetArchivesPydantic
# flake8 F401: 'src.db.models.impl.flag.checked_for_ia.pydantic.FlagURLCheckedForInternetArchivesPydantic' imported but unused
from src.db.models.impl.url.ia_metadata.pydantic import URLInternetArchiveMetadataPydantic
from src.util.url_mapper import URLMapper


def convert_ia_url_mapping_to_ia_metadata(
    # flake8 D103: Missing docstring in public function
    url_mapper: URLMapper,
    ia_mapping: InternetArchivesURLMapping
) -> URLInternetArchiveMetadataPydantic:
    iam = ia_mapping.ia_metadata
    return URLInternetArchiveMetadataPydantic(
        url_id=url_mapper.get_id(ia_mapping.url),
        archive_url=iam.archive_url,
        digest=iam.digest,
        length=iam.length
    )
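
To make the data flow of `convert_ia_url_mapping_to_ia_metadata` concrete, here is a self-contained sketch that uses dataclasses in place of the project's Pydantic models. All class definitions below are simplified assumptions: only the field names (`url`, `ia_metadata`, `archive_url`, `digest`, `length`, `url_id`) and the `get_id` method appear in the diff, the dictionary-backed `URLMapper` is a guess at its behaviour, and the sample values are made up.

```python
from dataclasses import dataclass


@dataclass
class IAMetadata:
    """Hypothetical stand-in for the metadata attached to an IA mapping."""
    archive_url: str
    digest: str
    length: int


@dataclass
class IAURLMapping:
    """Hypothetical stand-in for InternetArchivesURLMapping."""
    url: str
    ia_metadata: IAMetadata


@dataclass
class URLIAMetadataRow:
    """Hypothetical stand-in for URLInternetArchiveMetadataPydantic."""
    url_id: int
    archive_url: str
    digest: str
    length: int


class URLMapper:
    """Assumed behaviour: look up a database id by URL."""

    def __init__(self, url_to_id: dict[str, int]) -> None:
        self._url_to_id = url_to_id

    def get_id(self, url: str) -> int:
        return self._url_to_id[url]


def convert_mapping_to_row(url_mapper: URLMapper, ia_mapping: IAURLMapping) -> URLIAMetadataRow:
    """Mirror of the conversion in the diff: resolve the URL id, copy the metadata."""
    iam = ia_mapping.ia_metadata
    return URLIAMetadataRow(
        url_id=url_mapper.get_id(ia_mapping.url),
        archive_url=iam.archive_url,
        digest=iam.digest,
        length=iam.length,
    )


mapper = URLMapper({"https://example.com/report": 5})
mapping = IAURLMapping(
    url="https://example.com/report",
    ia_metadata=IAMetadata(
        archive_url="https://archive.example/snapshot",
        digest="ABC123",
        length=2048,
    ),
)
row = convert_mapping_to_row(mapper, mapping)
```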