feat: decouple log configuration from apifylog #3399

Open
l2ysho wants to merge 42 commits into v4 from 3068-decouple-log-configuration-from-apifylog

@l2ysho
Contributor

@l2ysho l2ysho commented Feb 9, 2026

Introduces a CrawleeLogger interface and BaseCrawleeLogger abstract class so users can plug in any logger (Winston, Pino, Bunyan, etc.) instead of being locked to @apify/log.

  1. CrawleeLogger interface — the contract any logger must satisfy
  2. BaseCrawleeLogger abstract class — implement _log() + _createChild(), get 13 methods for free
  3. Configuration.loggerProvider — the injection point
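
As a rough illustration of the "implement two hooks, get the rest for free" idea, here is a minimal, self-contained sketch. It is not the PR's code: `MiniLogger`, `MiniBaseLogger`, `MemoryLogger`, and the `LEVELS` numbers are hypothetical stand-ins, and only four derived methods are shown instead of the 13 the real base class is said to provide.

```typescript
// Hypothetical sketch: names mirror the PR description, bodies are assumptions.
interface LoggerOptions {
    prefix?: string;
}

type LogData = Record<string, unknown>;

interface MiniLogger {
    error(message: string, data?: LogData): void;
    warning(message: string, data?: LogData): void;
    info(message: string, data?: LogData): void;
    debug(message: string, data?: LogData): void;
    child(options: LoggerOptions): MiniLogger;
}

// Numeric levels assumed to follow the mapping used in the Winston example.
const LEVELS = { ERROR: 1, WARNING: 3, INFO: 4, DEBUG: 5 } as const;

abstract class MiniBaseLogger implements MiniLogger {
    protected options: LoggerOptions;

    constructor(options: LoggerOptions = {}) {
        this.options = options;
    }

    // The two hooks an adapter author has to write.
    protected abstract _log(level: number, message: string, data?: LogData): void;
    protected abstract _createChild(options: LoggerOptions): MiniLogger;

    // Everything below is derived from the two hooks.
    error(message: string, data?: LogData): void {
        this._log(LEVELS.ERROR, message, data);
    }

    warning(message: string, data?: LogData): void {
        this._log(LEVELS.WARNING, message, data);
    }

    info(message: string, data?: LogData): void {
        this._log(LEVELS.INFO, message, data);
    }

    debug(message: string, data?: LogData): void {
        this._log(LEVELS.DEBUG, message, data);
    }

    child(options: LoggerOptions): MiniLogger {
        return this._createChild({ ...this.options, ...options });
    }
}

// A toy adapter that collects formatted lines in memory.
class MemoryLogger extends MiniBaseLogger {
    readonly lines: string[];

    constructor(options: LoggerOptions = {}, lines: string[] = []) {
        super(options);
        this.lines = lines;
    }

    protected _log(level: number, message: string): void {
        const prefix = this.options.prefix ? `${this.options.prefix}: ` : '';
        this.lines.push(`${level} ${prefix}${message}`);
    }

    protected _createChild(options: LoggerOptions): MiniLogger {
        // Children write into the same array as the parent.
        return new MemoryLogger(options, this.lines);
    }
}

const root = new MemoryLogger();
root.info('started');
root.child({ prefix: 'Request' }).warning('retrying');
```

The adapter only decides where a formatted line goes; level helpers and child creation live in the base class.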

How it works internally

Every module that needs logging now calls config.getLogger() instead of importing @apify/log directly. Two patterns:

| Pattern | When |
| --- | --- |
| config.getLogger().child({ prefix }) | Classes that receive config (BasicCrawler, AutoscaledPool, etc.) |
| Configuration.getGlobalConfig().getLogger() | Standalone functions without config access (request.ts, cookie_utils.ts, etc.) |

Only two files still import @apify/log directly: log.ts (defines interface) and configuration.ts (default fallback).
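
The two access patterns can be sketched with stand-in types. `StubConfiguration`, `StubCrawler`, `parseCookie`, and `makeLogger` below are hypothetical and exist only to show which call each kind of caller would make; the real Configuration class has a much larger surface.

```typescript
// Stand-ins for illustration only; the real Configuration and logger are richer.
interface PrefixLogger {
    prefix: string;
    info(message: string): string;
    child(options: { prefix: string }): PrefixLogger;
}

function makeLogger(prefix = ''): PrefixLogger {
    return {
        prefix,
        // Returns the formatted line instead of printing it, to keep the sketch testable.
        info: (message) => (prefix ? `${prefix}: ${message}` : message),
        child: (options) => makeLogger(options.prefix),
    };
}

class StubConfiguration {
    private static globalConfig: StubConfiguration | undefined;
    private logger: PrefixLogger;

    constructor(logger: PrefixLogger = makeLogger()) {
        this.logger = logger;
    }

    static getGlobalConfig(): StubConfiguration {
        StubConfiguration.globalConfig ??= new StubConfiguration();
        return StubConfiguration.globalConfig;
    }

    getLogger(): PrefixLogger {
        return this.logger;
    }
}

// Pattern 1: a class that receives a config derives a prefixed child logger.
class StubCrawler {
    readonly log: PrefixLogger;

    constructor(config: StubConfiguration) {
        this.log = config.getLogger().child({ prefix: 'BasicCrawler' });
    }
}

// Pattern 2: a standalone function falls back to the global config.
function parseCookie(raw: string): string {
    return StubConfiguration.getGlobalConfig().getLogger().info(`parsing ${raw}`);
}

const crawlerLine = new StubCrawler(new StubConfiguration()).log.info('run started');
const cookieLine = parseCookie('a=b');
```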

  • Example — Winston adapter in ~15 lines:
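// BaseCrawleeLogger, CrawleeLogger and CrawleeLoggerOptions come from this PR;
// import them from wherever the package exports them.
import * as winston from 'winston';

// Crawlee numeric levels mapped to Winston level names, assuming the @apify/log LogLevel numbering.
// Level 2 (SOFT_FAIL) has no Winston counterpart, so it is folded into 'warn'; level 6 is folded into 'debug'.
const WINSTON_LEVELS: Record<number, string> = { 1: 'error', 2: 'warn', 3: 'warn', 4: 'info', 5: 'debug', 6: 'debug' };

class WinstonAdapter extends BaseCrawleeLogger {
    constructor(private logger: winston.Logger, options?: Partial<CrawleeLoggerOptions>) {
        super(options);
    }

    protected _log(level: number, message: string, data?: Record<string, any> | null): void {
        const winstonLevel = WINSTON_LEVELS[level] ?? 'info';
        this.logger.log(winstonLevel, message, { ...data, prefix: this.getOptions().prefix });
    }

    protected _createChild(options: Partial<CrawleeLoggerOptions>): CrawleeLogger {
        return new WinstonAdapter(this.logger.child({ prefix: options.prefix }), { ...this.getOptions(), ...options });
    }
}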

@l2ysho l2ysho self-assigned this Feb 9, 2026
l2ysho and others added 5 commits February 17, 2026 19:04
Allow connecting to remote browser services (Browserless, BrowserBase,
Playwright Server) instead of always launching local browsers. (#1822)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@l2ysho l2ysho changed the title 3068 decouple log configuration from apifylog feat: decouple log configuration from apifylog Feb 19, 2026
@l2ysho l2ysho marked this pull request as ready for review February 20, 2026 09:27
@l2ysho l2ysho requested review from B4nan and barjin February 20, 2026 09:28
@l2ysho
Contributor Author

l2ysho commented Feb 20, 2026

Ready for review. Users now only need to define two functions; the wrapper handles the rest.

Member

@barjin barjin left a comment

Thank you @l2ysho 🙌 I only have some minor notes ⬇️

Comment on lines 18 to 24
// Lazy singleton — evaluated on first use so a user-configured logger is picked up
// rather than the default that exists at module load time.
let _log: CrawleeLogger | undefined;
const getLog = () => {
    _log ??= Configuration.getGlobalConfig().getLogger().child({ prefix: 'Request' });
    return _log;
};
Member

nit / refactor: can getLog(ger) be a static method on Request, just to make the design a bit tidier?
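
One possible shape of that refactor, sketched with hypothetical stand-ins (`RequestSketch`, `globalLogger`, `ChildLogger`) rather than the real Request class or Configuration:

```typescript
// Hypothetical stand-ins; only the shape of the refactor matters here.
type ChildLogger = { prefix: string };

const globalLogger = {
    child: (options: { prefix: string }): ChildLogger => ({ prefix: options.prefix }),
};

class RequestSketch {
    private static cachedLog: ChildLogger | undefined;

    // Same lazy evaluation as the module-level getLog(), but scoped to the class.
    static getLog(): ChildLogger {
        RequestSketch.cachedLog ??= globalLogger.child({ prefix: 'Request' });
        return RequestSketch.cachedLog;
    }
}

const firstLog = RequestSketch.getLog();
const secondLog = RequestSketch.getLog();
```

The behavior is unchanged; the cache simply lives on the class instead of in module scope.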

/**
 * Internal logging method used by some Crawlee internals.
 */
internal(level: number, message: string, data?: Record<string, unknown>, exception?: Error): void;
Member

If this is exposed in the public interface, we should assume people will use this (especially if there is no other method that does this).

How about naming this logger.logWithLevel or something similar, and adding a friendly comment that encourages (power) users to use it instead?

Contributor Author

You mean rename it only in the interface and keep the internal implementation in the abstract class?

Member

I suppose you'll need to rename both the interface member and the implementing function, but yes, it's just about the name / JSDoc for me. I believe our more advanced users could find use cases for this method too; there is no need to hide it from them.

Contributor Author

I am digging deeper into this and I am not sure what you mean. The internal function is implemented in apify/log; do you want this change to bubble up to the apify/log package?

Member

I'm sorry if I'm confusing.

I'd prefer to have a more descriptive name for the internal method, as it's public. This change would have to rename both CrawleeLogger.internal and BaseCrawleeLogger.internal.

As Honza mentions in the other comment

We don't really need to keep @apify/log directly assignable to CrawlingContext.log

so the @apify/log's interface doesn't really bother me now (ok it does - for the same reason as here - but we don't need to fix it now :)).

Contributor Author

I'm sorry if I'm confusing.

I believe I am the source of my confusion :]

But I have this example, one of the few usages of .internal, in basic-crawler.ts:

this.log.internal(LogLevel[(options.level as 'DEBUG') ?? 'DEBUG'], message, data);

this.log is either @apify/log or a custom logger provided by the user, so I have to ensure that the method exists on both. I cannot rename it in CrawleeLogger and BaseCrawleeLogger because I have to match @apify/log.

Or am I missing something? 🤔

Member

As Honza mentions in the other comment

Feel free to tweak stuff to a reasonable degree; we can afford BC breaks. We don't really need to keep @apify/log directly assignable to CrawlingContext.log. It's no problem if that's going to require some additional wrapper or even adding a new method to @apify/log.

I.e. right now, you don't have to match the @apify/log and CrawleeLogger APIs (I believe that was the original reason for these changes; we want to free ourselves from @apify/log and create our own Crawlee thing).

this.log is either @apify/log or custom logger provided by user.

imo after merging this PR, this.log should implement CrawleeLogger interface (and that's all we'll know about it). Whether it's a custom user-supplied logger or a CrawleeLogger wrapper for @apify/log should be inconsequential.

Let's keep the .internal() method for now if it's easier. I don't want to muddle your PR with minor things like this :) It's still good that we discussed this, because maybe I'm understanding the motivations behind this PR wrong.

Can we please get a third pair of eyes @janbuchar? 🙏
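
The wrapper option mentioned above could look roughly like this. All names are hypothetical: `LegacyLog` imitates a logger that only exposes internal(), `RenamedLogger` imitates an interface using the suggested logWithLevel name, and neither is taken from @apify/log or from this PR.

```typescript
// Hypothetical shapes used only to illustrate the wrapper idea.
interface LegacyLog {
    internal(level: number, message: string, data?: Record<string, unknown>, exception?: Error): void;
}

interface RenamedLogger {
    /** Logs a message at an explicit numeric level; intended for power users. */
    logWithLevel(level: number, message: string, data?: Record<string, unknown>): void;
}

// Callers only see RenamedLogger; whether a legacy logger sits behind it is an implementation detail.
class LegacyLogWrapper implements RenamedLogger {
    private readonly inner: LegacyLog;

    constructor(inner: LegacyLog) {
        this.inner = inner;
    }

    logWithLevel(level: number, message: string, data?: Record<string, unknown>): void {
        this.inner.internal(level, message, data);
    }
}

const calls: Array<[number, string]> = [];
const legacy: LegacyLog = {
    internal: (level, message) => {
        calls.push([level, message]);
    },
};

const wrapped: RenamedLogger = new LegacyLogWrapper(legacy);
wrapped.logWithLevel(5, 'handled by the wrapper');
```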


  if (BasicCrawler.useStateCrawlerIds.size > 1) {
-     defaultLog.warningOnce(
+     this.log.warningOnce(
Member

This is a semantic change - originally, we would call warningOnce on the global instance (so it would log the warning only once).

Now we're calling warningOnce on each crawler's instance separately. These do not share the state, so this will log the message n times (n == crawler count).
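
A small sketch of the effect, assuming warningOnce deduplicates with per-instance state. `OnceLogger` is a hypothetical stand-in, not the actual logger implementation.

```typescript
// Hypothetical logger whose "once" bookkeeping lives on the instance.
class OnceLogger {
    private readonly output: string[];
    private readonly seen: Set<string>;

    constructor(output: string[]) {
        this.output = output;
        this.seen = new Set();
    }

    warningOnce(message: string): void {
        if (this.seen.has(message)) return;
        this.seen.add(message);
        this.output.push(message);
    }
}

const output: string[] = [];

// One logger per crawler: each instance has its own "seen" set, so the message repeats.
const perCrawlerLogs = [new OnceLogger(output), new OnceLogger(output), new OnceLogger(output)];
for (const log of perCrawlerLogs) {
    log.warningOnce('multiple crawlers share useState');
}
const perInstanceCount = output.length;

// One shared logger: the message is emitted a single time.
output.length = 0;
const sharedLog = new OnceLogger(output);
for (let i = 0; i < 3; i++) {
    sharedLog.warningOnce('multiple crawlers share useState');
}
const sharedCount = output.length;
```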

Member

Does it matter really? Feels all right to me.

Member

It logs the same multiline message multiple times... not the end of the world, but imo unnecessary. I noticed this just because I dealt with this when implementing this not so long ago :)

If using the global log instance proves too complicated, I'm fine with this; if it's a one-line change, I'd prefer clean logs.

Contributor Author

Yep, Claude also suggested this to me, but I did not believe it at the time. I can take a look; I can probably store the state and pass it to the child.
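
The "store the state and pass it to the child" idea might look like the following. `DedupLogger` is hypothetical, and this is only one way to share the bookkeeping between a parent and its children.

```typescript
// Hypothetical logger where children reuse the parent's "seen" set.
class DedupLogger {
    private readonly sink: string[];
    private readonly prefix: string;
    private readonly seen: Set<string>;

    constructor(sink: string[], prefix = '', seen: Set<string> = new Set()) {
        this.sink = sink;
        this.prefix = prefix;
        this.seen = seen;
    }

    child(prefix: string): DedupLogger {
        // Passing this.seen along makes "once" hold across the whole logger family.
        return new DedupLogger(this.sink, prefix, this.seen);
    }

    warningOnce(message: string): void {
        if (this.seen.has(message)) return;
        this.seen.add(message);
        this.sink.push(this.prefix ? `${this.prefix}: ${message}` : message);
    }
}

const sink: string[] = [];
const parentLog = new DedupLogger(sink);
parentLog.child('Crawler A').warningOnce('shared state in use');
parentLog.child('Crawler B').warningOnce('shared state in use');
```

Only the first child to hit the message logs it; the second finds it already recorded in the shared set.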

Contributor Author

@l2ysho l2ysho Feb 20, 2026

Fixed by using a static logger instance for basic-crawler.

@janbuchar janbuchar self-requested a review February 20, 2026 11:30
@l2ysho
Contributor Author

l2ysho commented Feb 23, 2026

Feedback from Martin:

Important (Should Fix)

  • Fix stale lazy singleton in request.ts after config reset
  • Fix exception() discarding the Error object in log.ts
  • Complete decoupling — wire 7 files still using @apify/log through Configuration.getLogger()
  • Standardize data parameter type in CrawleeLogger interface

Suggestions (Nice to Have)

  • Add loggerProvider to Configuration JSDoc table
  • Add end-to-end integration test for custom logger via Configuration
  • Fix ProxyConfiguration bypassing injected config
  • Add null guard for getOptions().prefix in basic-crawler.ts
  • Add explanation for SOFT_FAIL mapping in Winston example
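
For the stale lazy singleton item, one possible approach is to remember which config instance the cached logger came from and rebuild it when the global config changes. This is a sketch under assumptions: `ConfigSketch` and its `resetGlobal` hook are hypothetical and do not describe how the real Configuration is reset.

```typescript
// Hypothetical config with a replaceable global instance.
class ConfigSketch {
    private static current: ConfigSketch | undefined;
    readonly name: string;

    constructor(name: string) {
        this.name = name;
    }

    static getGlobalConfig(): ConfigSketch {
        ConfigSketch.current ??= new ConfigSketch('default');
        return ConfigSketch.current;
    }

    // Hypothetical reset hook, standing in for whatever replaces the global config.
    static resetGlobal(next: ConfigSketch): void {
        ConfigSketch.current = next;
    }

    getLogger(): { source: string } {
        return { source: this.name };
    }
}

let cached: { config: ConfigSketch; log: { source: string } } | undefined;

function getLog(): { source: string } {
    const config = ConfigSketch.getGlobalConfig();
    // Rebuild the cached logger whenever the global config instance has changed.
    if (!cached || cached.config !== config) {
        cached = { config, log: config.getLogger() };
    }
    return cached.log;
}

const beforeReset = getLog().source;
ConfigSketch.resetGlobal(new ConfigSketch('custom'));
const afterReset = getLog().source;
```

A plain `??=` cache would keep returning the logger from the first config; comparing the config identity avoids that at the cost of one extra lookup per call.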

@janbuchar
Contributor

@l2ysho is it fine if I postpone my review until the conflicts are resolved and Jindra's feedback incorporated?

@l2ysho
Contributor Author

l2ysho commented Feb 23, 2026

@l2ysho is it fine if I postpone my review until the conflicts are resolved and Jindra's feedback incorporated?

Definitely

@l2ysho
Contributor Author

l2ysho commented Feb 24, 2026

@janbuchar I think you can start; I have addressed the most critical issues from the feedback. In the meantime, I still have 10 minor/nice-to-have issues queued up to check.

4 participants