Fix documentation and the blocking bugs for local backend #249


Open

rabbull wants to merge 1 commit into master

Conversation

@rabbull rabbull commented Apr 2, 2025

This PR introduces several small fixes to make the local backend runnable out-of-the-box again.

  1. Added an empty "modules" entry to the configuration structure of the microbenchmarks (see the sketch after this list).
  2. Supplemented unused parameters in the Python input interface of the microbenchmarks.
  3. Updated the Local section in the usage documentation.
  4. Removed the version segment from the Docker build image tag.
  5. Fixed the format of the basic_image field in the configuration for the local backend.
  6. Added missing parameters to the invocation of update_function in the Local system.
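For illustration, a minimal sketch of a fixed microbenchmark config.json; only the `"modules": []` entry is the addition from this PR, and the other field values are placeholders rather than values taken from the repository:

```json
{
  "timeout": 120,
  "memory": 128,
  "languages": ["python"],
  "modules": []
}
```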

Closes #248.

Summary by CodeRabbit

  • New Features

    • Introduced an additional configuration option for module management in benchmark setups.
  • Documentation

    • Updated commands and configuration details for storage, deployment, and container management, enhancing ease of use and clarity.
  • Refactor

    • Standardized benchmark input handling and refined deployment operations, including simplified image naming and streamlined update processes.

coderabbitai bot commented Apr 2, 2025

Walkthrough

The changes update several benchmark configurations and function interfaces. Four microbenchmark JSON files now include a new "modules": [] key. Corresponding input scripts have revised the generate_input function signature by renaming parameters and adding a nosql_func parameter. The documentation has been updated with corrected commands for starting storage, launching Docker containers, and invoking functions. Additionally, minor adjustments were made in the SeBS modules—removing version strings from image names, dropping special cases for “local” deployments, and updating a deployment client method signature.

Changes

| Files | Change Summary |
|---|---|
| benchmarks/.../config.json (sleep, network-benchmark, clock-synchronization, server-reply) | Added new key `"modules": []` to each configuration file while preserving existing values. |
| benchmarks/.../input.py (sleep, network-benchmark, clock-synchronization, server-reply) | Updated `generate_input` signature: renamed parameters (replacing `input_buckets`/`output_buckets` with `benchmarks_bucket`, `input_paths`, and `output_paths`) and added a new parameter `nosql_func`. |
| docs/usage.md | Revised commands for storage startup, local deployment, and Docker container handling; updated JSON structures and examples. |
| sebs/benchmark.py, sebs/config.py, sebs/experiments/perf_cost.py | Removed version info from Docker image names, eliminated a special case for "local" in language versions, and updated the deployment client's `update_function` call with additional parameters. |

Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Caller
    participant generate_input
    Caller->>generate_input: Call with (data_dir, size, benchmarks_bucket, input_paths, output_paths, upload_func, nosql_func)
    generate_input-->>Caller: Returns generated input dictionary
```

```mermaid
sequenceDiagram
    participant PerfCost
    participant DeploymentClient
    PerfCost->>DeploymentClient: update_function(function, benchmark, False, '')
    DeploymentClient-->>PerfCost: Returns update result
```

Assessment against linked issues

| Objective | Addressed | Explanation |
|---|---|---|
| Fix local backend startup command (#248) | ✅ | |

Suggested reviewers

  • mcopik

Poem

I'm a rabbit with a code-filled beat,
Hopping through JSON keys so neat.
New parameters and commands hop in play,
With each commit, bugs hop away.
I nibble carrots and code with delight,
Celebrating smooth changes from morning to night!


@coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (1)
docs/usage.md (1)

180-180: Add language specification to code block

The fenced code block at line 180 is missing a language specification, which is recommended for proper syntax highlighting.

````diff
-```
+```bash
````
🧰 Tools
🪛 markdownlint-cli2 (0.17.2)

180-180: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3266d2d and 587fe8d.

📒 Files selected for processing (12)
  • benchmarks/000.microbenchmarks/010.sleep/config.json (1 hunks)
  • benchmarks/000.microbenchmarks/010.sleep/input.py (1 hunks)
  • benchmarks/000.microbenchmarks/020.network-benchmark/config.json (1 hunks)
  • benchmarks/000.microbenchmarks/020.network-benchmark/input.py (1 hunks)
  • benchmarks/000.microbenchmarks/030.clock-synchronization/config.json (1 hunks)
  • benchmarks/000.microbenchmarks/030.clock-synchronization/input.py (1 hunks)
  • benchmarks/000.microbenchmarks/040.server-reply/config.json (1 hunks)
  • benchmarks/000.microbenchmarks/040.server-reply/input.py (1 hunks)
  • docs/usage.md (2 hunks)
  • sebs/benchmark.py (1 hunks)
  • sebs/config.py (0 hunks)
  • sebs/experiments/perf_cost.py (1 hunks)
💤 Files with no reviewable changes (1)
  • sebs/config.py
🧰 Additional context used
🧬 Code Definitions (4)
benchmarks/000.microbenchmarks/010.sleep/input.py (4)
benchmarks/000.microbenchmarks/030.clock-synchronization/input.py (1)
  • generate_input (6-7)
benchmarks/000.microbenchmarks/020.network-benchmark/input.py (1)
  • generate_input (6-7)
benchmarks/000.microbenchmarks/040.server-reply/input.py (1)
  • generate_input (11-12)
sebs/benchmark.py (1)
  • generate_input (783-794)
sebs/experiments/perf_cost.py (1)
sebs/faas/system.py (1)
  • update_function (215-234)
benchmarks/000.microbenchmarks/020.network-benchmark/input.py (4)
benchmarks/000.microbenchmarks/030.clock-synchronization/input.py (1)
  • generate_input (6-7)
benchmarks/000.microbenchmarks/040.server-reply/input.py (1)
  • generate_input (11-12)
benchmarks/000.microbenchmarks/010.sleep/input.py (1)
  • generate_input (11-12)
sebs/benchmark.py (1)
  • generate_input (783-794)
benchmarks/000.microbenchmarks/030.clock-synchronization/input.py (4)
benchmarks/000.microbenchmarks/020.network-benchmark/input.py (1)
  • generate_input (6-7)
benchmarks/000.microbenchmarks/040.server-reply/input.py (1)
  • generate_input (11-12)
benchmarks/000.microbenchmarks/010.sleep/input.py (1)
  • generate_input (11-12)
sebs/benchmark.py (1)
  • generate_input (783-794)
🪛 markdownlint-cli2 (0.17.2)
docs/usage.md

180-180: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🔇 Additional comments (15)
benchmarks/000.microbenchmarks/040.server-reply/config.json (1)

4-5: Adds required "modules" field to the configuration structure.

This change adds the previously missing "modules" field to the configuration, which is required by the BenchmarkConfig class as seen in benchmark.py. This fix addresses one of the PR objectives to ensure the local backend works correctly out-of-the-box.

benchmarks/000.microbenchmarks/010.sleep/config.json (1)

4-5: Adds required "modules" field to the configuration structure.

This change adds the previously missing "modules" field to the configuration, which is required by the BenchmarkConfig class as seen in benchmark.py. This fix ensures proper functionality when deserializing the configuration.

benchmarks/000.microbenchmarks/030.clock-synchronization/config.json (1)

4-5: Adds required "modules" field to the configuration structure.

The addition of the empty "modules" array ensures the configuration structure matches what's expected by the BenchmarkConfig deserializer. This consistency across benchmarks helps prevent runtime errors.

benchmarks/000.microbenchmarks/020.network-benchmark/config.json (1)

4-5: Adds required "modules" field to the configuration structure.

Adding the empty "modules" array ensures this benchmark's configuration follows the same pattern as other benchmarks, providing consistency and preventing potential deserialization issues in the BenchmarkConfig class.

sebs/benchmark.py (1)

433-437: Simplifies Docker image naming by removing version component.

This change removes the version segment from the Docker build image name, which streamlines the build process as mentioned in the PR objectives. By using a simpler naming convention, the system will likely be more maintainable and less prone to versioning-related issues.

benchmarks/000.microbenchmarks/010.sleep/input.py (1)

11-11: Function signature updated correctly

The generate_input function signature has been updated to match the interface defined in sebs/benchmark.py, which helps standardize the benchmark input generation across the codebase.
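For reference, the standardized interface looks as follows; the signature matches the one introduced in this PR, while the body is a schematic placeholder rather than the sleep benchmark's actual implementation:

```python
def generate_input(data_dir, size, benchmarks_bucket, input_paths,
                   output_paths, upload_func, nosql_func):
    # Microbenchmarks accept the full interface but typically use only
    # a subset of the parameters.
    return {'output-bucket': output_paths[0]}
```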

benchmarks/000.microbenchmarks/040.server-reply/input.py (1)

11-11: Function signature updated correctly

The generate_input function signature has been updated to match the interface defined in sebs/benchmark.py, which helps standardize the benchmark input generation across the codebase.

sebs/experiments/perf_cost.py (1)

86-86: Method signature update correctly implemented

The call to update_function has been updated to match the new method signature by adding two parameters: False for container_deployment and an empty string for container_uri. This change is consistent with the updated method signature in system.py and aligns with the PR objective of fixing the blocking bugs for the local backend.
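Concretely, the updated call has this shape (the two trailing positional arguments map to container_deployment and container_uri, per the description above):

```python
deployment_client.update_function(function, benchmark, False, "")
```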

docs/usage.md (7)

77-79: Storage start command updated correctly

The command has been updated to use a configuration file instead of hardcoded port values, allowing for more flexible deployment. Using the `all` parameter suggests it now starts both the object and NoSQL storage services defined in the configuration.
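A hypothetical invocation illustrating the described shape; the exact flags and file names live in docs/usage.md and are not guaranteed here:

```bash
./sebs.py storage start all config/storage.json --output-json out_storage.json
```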


87-89: Configuration update command enhanced

The command now correctly adds architecture information (x64) to the configuration, which is a necessary parameter for local deployment. The output is also properly directed to a new configuration file.
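The command itself appears in the docs diff further down in this review; reproduced here for convenience:

```bash
jq '.deployment.local.storage = input' config/example.json out_storage.json | \
jq '.experiments.architecture = "x64"' > config/local_deployment.json
```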


94-133: Storage configuration structure properly updated

The JSON structure now correctly includes both object storage (minio) and NoSQL storage (scylladb) sections, providing a comprehensive configuration template. This aligns with the updated command that uses a configuration file.
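A sketch of the described shape; all addresses, keys, and field names beyond the minio/scylladb split are placeholders, not values from the PR:

```json
{
  "object": {
    "type": "minio",
    "minio": {
      "address": "<host>:<port>",
      "access_key": "<generated>",
      "secret_key": "<generated>"
    }
  },
  "nosql": {
    "type": "scylladb",
    "scylladb": {
      "address": "<host>:<port>"
    }
  }
}
```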


139-140: Local start command correctly updated with container removal option

The command now includes the --remove-containers option, which is essential information for users who want to automatically clean up after testing. This change aligns with the note added at lines 185-186.
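For example (benchmark name and positional arguments are illustrative; the command shape mirrors the `./sebs.py local start ...` usage mentioned later in this thread):

```bash
./sebs.py local start 010.sleep test out_benchmark.json \
  --config config/local_deployment.json --remove-containers
```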


171-176: Function invocation command improved with dynamic values

The curl command now dynamically extracts the function URL and input data from the output JSON file using jq, which is a significant improvement over hardcoded values. This makes the documentation more general and maintainable.
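A hedged sketch of such an invocation; the jq paths into out_benchmark.json are assumptions, not the documented field names:

```bash
curl "$(jq -r '.functions[0].url' out_benchmark.json)" --request POST \
  --data "$(jq -r '.inputs[0]' out_benchmark.json)" \
  --header 'Content-Type: application/json'
```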


182-183: Storage stop command updated to match the start command

The command now uses the `all` parameter to stop all storage services, consistent with the updated start command.
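Presumably mirroring the start command, e.g. (hypothetical syntax):

```bash
./sebs.py storage stop all out_storage.json
```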


185-186: Helpful note about container removal

This addition provides important clarification about container lifecycle management, explaining that containers won't be automatically removed without the specific flag.

Comment on lines +6 to 7

```python
def generate_input(data_dir, size, benchmarks_bucket, input_paths, output_paths, upload_func, nosql_func):
    return {'output-bucket': output_buckets[0]}
```
⚠️ Potential issue

Fix variable reference in function implementation

The function signature has been updated, but the implementation still references output_buckets[0] which no longer exists as a parameter. This will cause a runtime error.

```diff
 def generate_input(data_dir, size, benchmarks_bucket, input_paths, output_paths, upload_func, nosql_func):
-    return {'output-bucket': output_buckets[0]}
+    return {'output-bucket': output_paths[0]}
```
🧰 Tools
🪛 Ruff (0.8.2)

7-7: Undefined name output_buckets

(F821)

Comment on lines +6 to 7

```python
def generate_input(data_dir, size, benchmarks_bucket, input_paths, output_paths, upload_func, nosql_func):
    return {'output-bucket': output_buckets[0]}
```
⚠️ Potential issue

Fix variable reference in function implementation

The function signature has been updated, but the implementation still references output_buckets[0] which no longer exists as a parameter. This will cause a runtime error.

```diff
 def generate_input(data_dir, size, benchmarks_bucket, input_paths, output_paths, upload_func, nosql_func):
-    return {'output-bucket': output_buckets[0]}
+    return {'output-bucket': output_paths[0]}
```
🧰 Tools
🪛 Ruff (0.8.2)

7-7: Undefined name output_buckets

(F821)

````diff
@@ -142,18 +168,22 @@ The output file `out_benchmark.json` will contain the information on containers
 In our example, we can use `curl` to invoke the function with provided input:

-```
-curl 172.17.0.3:9000 --request POST --data '{"random_len": 10,"username": "testname"}' --header 'Content-Type: application/json'
+```bash
````
Collaborator:
Nice improvement, thanks!

```diff
-jq '.deployment.local.storage = input' config/example.json out_storage.json > config/local_deployment.json
+jq '.deployment.local.storage = input' config/example.json out_storage.json | \
+jq '.experiments.architecture = "x64"' > config/local_deployment.json
```
Collaborator:

Better idea - please add architecture to the local part of config/example.json?

Author:
Currently the suite doesn't read the architecture from the deployment section (the local part) of the configuration. Would you like me to add logic so that an architecture defined in the deployment section overrides the one defined in experiments? Another workaround is specifying the architecture when starting the function containers: ./sebs.py local start ... --architecture=x64.

Collaborator:
We can use the CLI argument :)

```diff
@@ -430,11 +430,10 @@ def install_dependencies(self, output_dir):
             )
         else:
             repo_name = self._system_config.docker_repository()
-            image_name = "build.{deployment}.{language}.{runtime}-{version}".format(
+            image_name = "build.{deployment}.{language}.{runtime}".format(
```
Collaborator:
We added version tags recently, but I didn't manage to update the images in DockerHub. Please remove this change, and I will update images with new tags :)

Author:
Sure!

However, would it be useful if the benchmark suite could fall back to images without version tags when versioned images cannot be found on DockerHub? If so, I could implement it here.
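A sketch of the proposed fallback, assuming docker-py; the function name and call site are hypothetical:

```python
import docker

def pull_with_fallback(client: docker.DockerClient, repository: str,
                       base_tag: str, version: str):
    """Pull the versioned image tag, falling back to the unversioned one."""
    try:
        return client.images.pull(repository, tag=f"{base_tag}-{version}")
    except docker.errors.ImageNotFound:
        # Versioned image not published on DockerHub yet.
        return client.images.pull(repository, tag=base_tag)
```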

```diff
@@ -46,9 +46,6 @@ def supported_language_versions(
 ) -> List[str]:
     languages = self._system_config.get(deployment_name, {}).get("languages", {})
     base_images = languages.get(language_name, {}).get("base_images", {})

-    if deployment_name == "local":
```
Collaborator:
Why is this change necessary?

Author:
When bringing the function containers up, the benchmark suite checks whether the language-version pair is supported by the backend system (local in our case), which invokes this function to obtain the supported language-version pairs. However, this function currently returns an empty set for the local backend.

The reason is that in commit 52f80a1, the corresponding configuration of the local system was wrapped with an architecture key:

```diff
--- a/config/systems.json
+++ b/config/systems.json
@@ -17,11 +17,13 @@
     "languages": {
       "python": {
         "base_images": {
-          "3.7": "python:3.7-slim",
-          "3.8": "python:3.8-slim",
-          "3.9": "python:3.9-slim",
-          "3.10": "python:3.10-slim",
-          "3.11": "python:3.11-slim"
+          "x64": {
+            "3.7": "python:3.7-slim",
+            "3.8": "python:3.8-slim",
+            "3.9": "python:3.9-slim",
+            "3.10": "python:3.10-slim",
+            "3.11": "python:3.11-slim"
+          }
         },
         "images": [
           "run",
@@ -43,10 +45,12 @@
       },
       "nodejs": {
         "base_images": {
-          "14": "node:14-slim",
-          "16": "node:16-slim",
-          "18": "node:18-slim",
-          "20": "node:20-slim"
+          "x64": {
+            "14": "node:14-slim",
+            "16": "node:16-slim",
+            "18": "node:18-slim",
+            "20": "node:20-slim"
+          }
         },
         "images": [
           "run",

As a result, the `if` branch for local here is no longer relevant.
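A minimal sketch of why the branch is dead; with the architecture level in place, the nested lookup alone resolves the supported versions (the "x64" key is taken from the diff above):

```python
# system_config mirrors config/systems.json after commit 52f80a1.
languages = system_config.get("local", {}).get("languages", {})
base_images = languages.get("python", {}).get("base_images", {})
versions = list(base_images.get("x64", {}).keys())
# -> ["3.7", "3.8", "3.9", "3.10", "3.11"]
```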

Collaborator:
Good catch, correct

@rabbull rabbull left a comment

Hi @mcopik,

Thanks for the review! I'm happy to have the opportunity to contribute to this repository.
I've replied to some of your comments regarding further development plans. Once you've had a chance to review them, I'll start refining my PR.

