
WIP: Add RPS based load option #65

Open
dagrayvid wants to merge 6 commits into openshift-psap:main from dagrayvid:dataset-split

Conversation

@dagrayvid
Collaborator

No description provided.

@dagrayvid dagrayvid requested a review from sjmonson October 21, 2024 14:05
Comment thread user.py
        return logging.getLogger("user")

    def _user_loop(self, test_end_time):
        while self.stop_q.empty():
Member

Just a thought for the future. What if we have the main process SIGTERM (or SIGUSR1) the subprocesses as a stop message and write a custom signal handler to clean up?
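A minimal sketch of that idea, assuming the handler would live in the User subprocess; `install_stop_handler` and the `cleanup` callback are hypothetical names, not part of the tool:

```python
import signal
import sys

def install_stop_handler(cleanup):
    """Install a SIGTERM handler that runs cleanup() before exiting.

    Hypothetical sketch: the main process would SIGTERM each subprocess
    instead of (or in addition to) setting a stop flag on a queue.
    """
    def _handler(signum, frame):
        cleanup()    # e.g. drain queues, flush pending results
        sys.exit(0)  # raises SystemExit so finally-blocks still run
    signal.signal(signal.SIGTERM, _handler)
```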

Collaborator

Yeah totally, that's long overdue. I was looking into it and should work on it sometime.

Comment thread load_test.py Outdated
log_reader_thread = logging_utils.init_logging(args.log_level, logger_q)

# Create processes and their Users
schedule_q = mp_ctx.Queue(1)
Member

Suggested change
schedule_q = mp_ctx.Queue(1)
schedule_q = mp_ctx.Queue(1)
schedule_q.cancel_join_thread()

Call cancel_join_thread() here to avoid the queue blocking on exit.
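For context, a sketch of why this helps (a spawn context is assumed here for illustration): by default a process that has put items on a `multiprocessing.Queue` blocks at exit until a background feeder thread flushes them to the pipe; `cancel_join_thread()` opts out of that wait, at the cost of possibly dropping still-queued items, which is acceptable for a scheduling queue at shutdown.

```python
import multiprocessing

# Sketch of the suggested pattern, not the tool's actual setup.
mp_ctx = multiprocessing.get_context("spawn")
schedule_q = mp_ctx.Queue(1)
schedule_q.cancel_join_thread()  # don't block process exit on unflushed items
```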

Comment thread load_test.py
    # Initialize the request_q with 2*concurrency requests
    for query in dataset.get_next_n_queries(2 * concurrency):
        dataset_q.put(query)
        request_q.put((None, query))
Member

From a clarity perspective I think it would be better to have this be a dict or object. E.g.

Suggested change
request_q.put((None, query))
request_q.put(dict(query=query, req_time=None))

or set a field on the query dict.
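A hedged sketch of the "field on the query dict" variant; `scheduled_time` is an assumed field name, and `queue.Queue` stands in for the multiprocessing queue the tool actually uses:

```python
import queue

request_q = queue.Queue()
queries = [{"prompt": "hello"}, {"prompt": "world"}]  # stand-in dataset output

for query in queries:
    query["scheduled_time"] = None  # None = send as soon as a user is free
    request_q.put(query)
```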

Collaborator Author

Using a field in the query dict is much more elegant!

Collaborator

Yeah, I like the idea of making it a field in the query dict.

Comment thread load_test.py
Comment on lines +37 to +38

return
Member

Drop this return?

Suggested change
return

Collaborator

It doesn't, but I wonder if adding a dedicated try/except block in this function is worth it. We currently catch all the cascading exceptions with the generic Exception class in the main function, but that's probably not the cleanest way to handle them IMO.

Not suggesting this should be addressed in this PR, but a follow-up PR to clean up our exception handling might be good.
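A minimal sketch of the follow-up idea: catch specific, expected failures inside the worker function instead of one broad `except Exception` in main(). The function name and signature here are illustrative, not the tool's API:

```python
import logging
import queue

def get_next_request(request_q, timeout=0.1):
    """Fetch the next scheduled request, handling expected failures locally."""
    try:
        return request_q.get(timeout=timeout)
    except queue.Empty:
        return None  # expected: nothing scheduled yet, caller polls again
    except (OSError, ValueError) as e:
        # unexpected: queue closed or in an invalid state; log and re-raise
        logging.error("request queue failure: %s", e)
        raise
```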

Collaborator

@npalaska npalaska left a comment


Some minor nits and comments, but this looks ready to go.

Comment thread config.yaml
load_options:
  type: constant # Future options: loadgen, stair-step
  concurrency: 1
  type: rps # Options: concurrency, rps, loadgen, stair-step
Collaborator

Any reason you replaced the constant load type with concurrency? IMO, constant sounds closer to "constant load", i.e. a continuous stream of requests.

Collaborator Author

I was thinking that constant is ambiguous, since RPS can also be constant. My other thought is that we might later add dynamically changing RPS or dynamically changing concurrency, so either RPS or concurrency could be constant or dynamic.
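For example, the two modes discussed here might look like this in config.yaml; field names beyond `type` and `concurrency` (e.g. `rps`) are assumptions, not settled config keys:

```yaml
load_options:
  # Fixed number of concurrent users:
  type: concurrency
  concurrency: 8

  # Or a fixed request rate, decoupled from user count:
  # type: rps
  # rps: 4
```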

Comment thread load_test.py

def main_loop_concurrency_mode(dataset, request_q, start_time, end_time):
    """Let all users send requests repeatedly until end_time"""
    logging.info("Test from main process")
Collaborator

Do we still need this logging statement here?

Collaborator Author

No, I'll remove this, thanks!


result.output_tokens_before_timeout = result.output_tokens
result.output_text = response

result.calculate_results()
Collaborator

I wonder if it's time to deprecate the caikit_client_plugin?

Comment thread plugins/dummy_plugin.py

result.end_time = time.time()

result.calculate_results()
Collaborator

When we do a cleanup we probably should remove this file.

Collaborator Author

Yeah, this was originally added with the thought that it could be used in some test cases but we may want to remove it depending on how we decide to handle testing (unit tests, e2e tests, etc...)


Comment thread user.py
except queue.Empty:
    # if timeout passes, queue.Empty will be thrown
    # User should check if stop_q has been set, else poll again
    # self.debug.info("User waiting for a request to be scheduled")
Collaborator

Should this line be uncommented?
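For reference, a minimal sketch of the polling pattern this code context uses: block on the queue with a short timeout so the stop signal is still checked periodically. Names mirror the PR, but the free-function signature is an assumption:

```python
import queue

def wait_for_request(request_q, stop_q, timeout=0.1):
    """Poll request_q until a request arrives or stop_q is signalled."""
    while stop_q.empty():
        try:
            return request_q.get(timeout=timeout)
        except queue.Empty:
            continue  # timeout expired: re-check stop_q, then poll again
    return None  # stop was requested before a request arrived
```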


3 participants