Hello, I want to use this project in my benchmarks, but it fails :(
Repo: https://github.com/kostya/LangArena, benchmarks: Etc::LogParser, Template::Regex.
Go version: go1.26.0 darwin/arm64, MacBook M1.
To reproduce with the standard library regexp (the Go results are quite slow):
git clone https://github.com/kostya/LangArena
cd LangArena/golang
./run log
Etc::LogParser: OK in 22.419s
./run regex
Template::Regex: OK in 8.960s
Now try coregex:
git checkout try_coregex
./run log
Etc::LogParser: ERR[actual=2303964, expected=2439432] in 34.569s
./run regex
Template::Regex: ERR[actual=2613794111, expected=3554506535] in 264.582s
As you can see, it is slower, and the checksum does not match, which means the results are wrong somewhere.
LogParser code:
func (p *LogParser) compilePatterns() {
	patterns := []struct {
		name string
		re   string
	}{
		{"errors", ` [5][0-9]{2} | [4][0-9]{2} `},
		{"bots", `(?i)bot|crawler|scanner|spider|indexing|crawl|robot|spider`},
		{"suspicious", `(?i)etc/passwd|wp-admin|\.\./`},
		{"ips", `\d+\.\d+\.\d+\.35`},
		{"api_calls", `/api/[^ " ]+`},
		{"post_requests", `POST [^ ]* HTTP`},
		{"auth_attempts", `(?i)/login|/signin`},
		{"methods", `(?i)get|post|put`},
		{"emails", `[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}`},
		{"passwords", `password=[^&\s"]+`},
		{"tokens", `token=[^&\s"]+|api[_-]?key=[^&\s"]+`},
		{"sessions", `session[_-]?id=[^&\s"]+`},
		{"peak_hours", `\[\d+/\w+/\d+:1[3-7]:\d+:\d+ [+\-]\d+\]`},
	}
	p.patterns = make([]struct {
		name string
		re   *regexp.Regexp
	}, len(patterns))
	for i, pat := range patterns {
		p.patterns[i] = struct {
			name string
			re   *regexp.Regexp
		}{
			name: pat.name,
			re:   regexp.MustCompile(pat.re),
		}
	}
}

func (p *LogParser) Run(iteration_id int) {
	matches := make(map[string]int)
	for _, pattern := range p.patterns {
		matches[pattern.name] = len(pattern.re.FindAllStringIndex(p.log, -1))
	}
	total := 0
	for _, count := range matches {
		total += count
	}
	p.checksumVal += uint32(total)
}
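In case it helps narrow down which pattern diverges, here is a minimal sketch that counts matches per pattern and lists the ones that differ between two engines. The names `matcher`, `countMatches`, `diffCounts`, and `stdCompile` are hypothetical helpers, the log line is made up and not taken from the benchmark input, and the sketch only assumes that a coregex pattern exposes the same `FindAllStringIndex` method the benchmark already calls.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
)

// matcher is the single method the benchmark relies on.
// *regexp.Regexp satisfies it; a coregex pattern is assumed to as well.
type matcher interface {
	FindAllStringIndex(s string, n int) [][]int
}

// countMatches returns the number of matches for each named pattern.
func countMatches(patterns map[string]string, compile func(string) matcher, input string) map[string]int {
	counts := make(map[string]int, len(patterns))
	for name, expr := range patterns {
		counts[name] = len(compile(expr).FindAllStringIndex(input, -1))
	}
	return counts
}

// diffCounts lists, in sorted order, the patterns whose counts differ.
func diffCounts(want, got map[string]int) []string {
	var diffs []string
	for name, w := range want {
		if g := got[name]; g != w {
			diffs = append(diffs, fmt.Sprintf("%s: want %d, got %d", name, w, g))
		}
	}
	sort.Strings(diffs)
	return diffs
}

// stdCompile wraps the standard library engine as the reference.
func stdCompile(expr string) matcher { return regexp.MustCompile(expr) }

func main() {
	patterns := map[string]string{
		"errors": ` [5][0-9]{2} | [4][0-9]{2} `,
		"bots":   `(?i)bot|crawler|scanner|spider|indexing|crawl|robot|spider`,
		"ips":    `\d+\.\d+\.\d+\.35`,
	}
	// Hypothetical log line, not taken from the benchmark input.
	line := `1.2.3.35 - - [10/Oct/2024:14:05:01 +0000] "POST /api/login HTTP/1.1" 500 123 "Googlebot"`

	want := countMatches(patterns, stdCompile, line)
	// On the try_coregex branch, pass a coregex-backed compile function here
	// instead of stdCompile (left out because its API is not assumed).
	got := countMatches(patterns, stdCompile, line)

	fmt.Println(want)
	fmt.Println(diffCounts(want, got))
}
```

Running this over the real benchmark log with both engines should point at the specific pattern whose count is off, rather than only the combined checksum.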
TemplateRegex code:
func (t *TemplateRegex) Run(iteration_id int) {
	re := regexp.MustCompile(`\{\{\s*(.*?)\s*\}\}`)
	result := re.ReplaceAllStringFunc(t.text, func(match string) string {
		key := match[2 : len(match)-2]
		key = strings.TrimSpace(key)
		if val, ok := t.vars[key]; ok {
			return val
		}
		return ""
	})
	t.rendered = result
	t.checksum += uint32(len(t.rendered))
}
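For reference, a small standalone sketch of what the standard library produces for this replacement on a tiny input, which might serve as a minimal check against coregex. The `render` helper, the `placeholder` variable, and the sample text are hypothetical and not part of the benchmark; the pattern and the key extraction are the same as in `Run` above.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// placeholder is the same pattern the benchmark compiles in Run.
var placeholder = regexp.MustCompile(`\{\{\s*(.*?)\s*\}\}`)

// render replaces each {{ key }} with vars[key]; unknown keys become "".
func render(text string, vars map[string]string) string {
	return placeholder.ReplaceAllStringFunc(text, func(match string) string {
		// Strip the surrounding braces, then the optional whitespace.
		key := strings.TrimSpace(match[2 : len(match)-2])
		return vars[key]
	})
}

func main() {
	out := render("Hello {{ name }}, {{missing}}!", map[string]string{"name": "World"})
	fmt.Printf("%q %d\n", out, len(out)) // prints "Hello World, !" 14
}
```

If the coregex build returns a different string or length for this input, the lazy `(.*?)` group followed by `\s*\}\}` would be the first thing to look at.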