Skip to content

Canonicalize pin paths #4274

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 6 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
117 changes: 117 additions & 0 deletions docs/PinPaths.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,117 @@
# Pin path design

On Linux, objects can be pinned to a filesystem path. As a result:
* Pins can be enumerated or removed via normal filesystem APIs/utilities
* Pin paths are case-sensitive
* The path component separator is a forward slash ('/')
* Pin paths typically begin with "/sys/fs/bpf/", but could begin with other
paths. Relative paths like "mymap" are then relative to that prefix.
* ".." and "." components are resolved.

On Windows, pin paths are not currently part of the filesystem. As a result,
pins cannot be removed via normal filesystem deletion APIs/utilities and instead
ebpf_get_next_pinned_object_path() and ebpf_object_unpin() are exposed
for enumeration and unpinning, respectively.

This leaves open the question of what syntax(es) we support on Windows for pin paths.

## Criteria for Evaluation

The following are criteria for evaluating each option:

1. Slashes: Paths must be accepted using either '/' or '\\' and be canonicalized.

1. Kernel: Canonicalization must be the same in kernel-mode (for native) and user-mode (for JIT).

1. LocalFile: A pin path should not collide with the path of another file in the Windows filesystem.

1. RemoteFile: Avoid confusion with remote paths.

1. FileAPIs: Paths should work with Windows file APIs in the future once a BPF filesystem is implemented.

1. CaseSensitive: Paths should be case-sensitive.

## Options

### Native (Actual file system path)

Example: C:\ebpf\global\my\pin\path

In this option, querying the path of a pinned object would give an actual file system path.
This might be confusing if the path conflicts with an actual path where there may be files,
so additional care would be needed to fail such a pin.

Furthermore, an actual file system path would be case-insensitive rather than case-sensitive.
Another implementation challenge is that the logic for forming pin paths for native programs
is in the kernel (to minimize security concerns), and there isn't an easy way to get the
system drive, and picking "C:" would be fairly arbitrary.

### Mnt (Actual file system path in Linux-style syntax)

Example: /mnt/c/ebpf/global/my/pin/path

Like the previous option, the current or system drive letter is still needed, so this has
the same issues as the previous option.

### Virtual (Windows device path with another virtual drive for BPF)

Example: BPF:\my\pin\path
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for being a windows noob, is it possible to have multi letter drives? Never seen one of those.

Also, would this drive show up in the explorer?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not a drive, but a virtual device (aka DOS device name). There are a bunch of these in windows, like CON:, PRN:, NUL:, COM1:, etc. And drivers can add new ones.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated wording to explain.


Note that this is a device path not a "file system" path per se.

In this option, there would be no collisions with actual files, and the path can be case
sensitive if the "BPF:" driver declares it as such. Similarly, ".." resolution would work if
the driver implements it as such.

As for portability, canonicalization could ensure that Linux style paths like
"/sys/fs/bpf/my/pin/path" would be canonicalized to "BPF:\sys\fs\bpf\my\pin\path" allowing
portability in that respect. Querying the pin path using a Windows-specific
API on an object would show the canonical path.

### Linux (Linux-like file system path)

Example: /sys/fs/bpf/my/pin/path

In this option, the path would look the same as on Linux where
the canonical form uses forward slash rather than backslash.
It cannot be used with other Windows file system APIs in the future.

### UNC (UNC path)

Example: \\BPF\my\pin\path
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is also \\?\ which would disable all path parsing and would forward the path to the filesystem verbatim. And \\.\ which points into the object namespace, but I don't know whether we could hook that somehow?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a brief note.


In this option, "BPF" would be a special "host" like "localhost" or "?".
However, it may be confusing if there is another host with the name "bpf",
or at least lead people to believe there might be.

### Others

There are also "\\?\" and "\\.\" paths on Windows, but these don't
appear to provide any value that one or more of the other options don't
already provide so are omitted from the rest of the evaluation.

## Evaluation Matrix

The matrix below shows how each option evaluates using the
criteria discussed above. "Y" means the option is good,
"N" means the option is bad, and "-" means it would be somewhere
in between good and bad.

| Criteria | Native | Mnt | Virtual | Linux | UNC |
| ------------- | ------ | --- | ------- | ----- | --- |
| Slashes | Y | Y | Y | Y | Y |
| Kernel | - | - | Y | Y | Y |
| LocalFile | - | - | Y | Y | Y |
| RemoteFile | Y | Y | Y | Y | N |
| FileAPIs | Y | N | Y | N | Y |
| CaseSensitive | N | N | Y | Y | Y |

## Decision

Based on the evaluation matrix, the Virtual form
("BPF:\my\pin\path") will be the canonical form.

In addition, for portability, the Linux form ("/sys/fs/bpf/my/pin/path")
will be accepted as valid, as will the older eBPF for Windows
path ("/ebpf/global/my/pin/path"). These will be canonicalized
to "BPF:\my\pin\path" internally.
1 change: 1 addition & 0 deletions ebpfapi/Source.def
Original file line number Diff line number Diff line change
Expand Up @@ -109,6 +109,7 @@ EXPORTS
ebpf_api_close_handle
ebpf_api_get_pinned_map_info
ebpf_api_map_info_free
ebpf_canonicalize_pin_path
ebpf_close_fd
ebpf_duplicate_fd
ebpf_enumerate_programs
Expand Down
15 changes: 14 additions & 1 deletion include/ebpf_api.h
Original file line number Diff line number Diff line change
Expand Up @@ -603,7 +603,7 @@ extern "C"
/**
* @brief Retrieve the next pinned path of an eBPF object.
*
* @param[in] start_path Path to look for an entry greater than or NULL.
* @param[in] start_path Path to look for an entry greater than.
* @param[out] next_path Returns the next path in lexicographical order, if one exists.
* @param[in] next_path_len Length of the next path buffer.
* @param[in, out] type On input, the type of object to retrieve or EBPF_OBJECT_UNKNOWN.
Expand All @@ -619,6 +619,19 @@ extern "C"
size_t next_path_len,
_Inout_ ebpf_object_type_t* type) EBPF_NO_EXCEPT;

/**
* @brief Canonicalize a path using filesystem canonicalization rules.
*
* @param[out] output Buffer in which to write canonicalized path.
* @param[in] output_size Size of output buffer.
* @param[out] error_code Zero on success, non-zero Win32 error code on failure.
*
* @retval EBPF_SUCCESS The operation was successful.
* @retval EBPF_INVALID_ARGUMENT The input path was invalid.
*/
ebpf_result_t
ebpf_canonicalize_pin_path(_Out_writes_(output_size) char* output, size_t output_size, _In_z_ const char* input) EBPF_NO_EXCEPT;

typedef struct _ebpf_program_info ebpf_program_info_t;

/**
Expand Down
57 changes: 46 additions & 11 deletions libs/api/ebpf_api.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1273,6 +1273,12 @@ ebpf_object_pin(fd_t fd, _In_z_ const char* path) NO_EXCEPT_TRY
ebpf_handle_t handle;

ebpf_assert(path);
char canonical_path[MAX_PATH];
uint32_t return_value = ebpf_canonicalize_path(canonical_path, sizeof(canonical_path), path);
if (return_value != ERROR_SUCCESS) {
EBPF_RETURN_RESULT(EBPF_INVALID_ARGUMENT);
}

if (fd <= 0) {
EBPF_RETURN_RESULT(EBPF_INVALID_ARGUMENT);
}
Expand All @@ -1282,14 +1288,14 @@ ebpf_object_pin(fd_t fd, _In_z_ const char* path) NO_EXCEPT_TRY
EBPF_RETURN_RESULT(EBPF_INVALID_FD);
}

auto path_length = strlen(path);
auto path_length = strlen(canonical_path);
ebpf_protocol_buffer_t request_buffer(offsetof(ebpf_operation_update_pinning_request_t, path) + path_length);
auto request = reinterpret_cast<ebpf_operation_update_pinning_request_t*>(request_buffer.data());

request->header.id = ebpf_operation_id_t::EBPF_OPERATION_UPDATE_PINNING;
request->header.length = static_cast<uint16_t>(request_buffer.size());
request->handle = handle;
std::copy(path, path + path_length, request->path);
std::copy(canonical_path, canonical_path + path_length, request->path);
result = win32_error_code_to_ebpf_result(invoke_ioctl(request_buffer));

EBPF_RETURN_RESULT(result);
Expand All @@ -1301,14 +1307,21 @@ ebpf_object_unpin(_In_z_ const char* path) NO_EXCEPT_TRY
{
EBPF_LOG_ENTRY();
ebpf_assert(path);
auto path_length = strlen(path);
char canonical_path[MAX_PATH];
uint32_t return_value = ebpf_canonicalize_path(canonical_path, sizeof(canonical_path), path);
if (return_value != ERROR_SUCCESS) {
ebpf_result_t result = win32_error_code_to_ebpf_result(return_value);
EBPF_RETURN_RESULT(result);
}

auto path_length = strlen(canonical_path);
ebpf_protocol_buffer_t request_buffer(offsetof(ebpf_operation_update_pinning_request_t, path) + path_length);
auto request = reinterpret_cast<ebpf_operation_update_pinning_request_t*>(request_buffer.data());

request->header.id = ebpf_operation_id_t::EBPF_OPERATION_UPDATE_PINNING;
request->header.length = static_cast<uint16_t>(request_buffer.size());
request->handle = UINT64_MAX;
std::copy(path, path + path_length, request->path);
std::copy(canonical_path, canonical_path + path_length, request->path);
EBPF_RETURN_RESULT(win32_error_code_to_ebpf_result(invoke_ioctl(request_buffer)));
}
CATCH_NO_MEMORY_EBPF_RESULT
Expand Down Expand Up @@ -1401,16 +1414,22 @@ ebpf_object_get(_In_z_ const char* path, _Out_ fd_t* fd) NO_EXCEPT_TRY
{
EBPF_LOG_ENTRY();
ebpf_assert(fd);
ebpf_assert(path);

size_t path_length = strlen(path);
char canonical_path[MAX_PATH];
uint32_t return_value = ebpf_canonicalize_path(canonical_path, sizeof(canonical_path), path);
if (return_value != ERROR_SUCCESS) {
EBPF_RETURN_RESULT(EBPF_INVALID_ARGUMENT);
}

size_t path_length = strlen(canonical_path);
ebpf_protocol_buffer_t request_buffer(offsetof(ebpf_operation_get_pinned_object_request_t, path) + path_length);
auto request = reinterpret_cast<ebpf_operation_get_pinned_object_request_t*>(request_buffer.data());
ebpf_operation_get_pinned_object_reply_t reply;
ebpf_assert(path);

request->header.id = ebpf_operation_id_t::EBPF_OPERATION_GET_PINNED_OBJECT;
request->header.length = static_cast<uint16_t>(request_buffer.size());
std::copy(path, path + path_length, request->path);
std::copy(canonical_path, canonical_path + path_length, request->path);
auto result = invoke_ioctl(request_buffer, reply);
if (result != ERROR_SUCCESS) {
*fd = (fd_t)ebpf_fd_invalid;
Expand Down Expand Up @@ -4182,10 +4201,16 @@ ebpf_get_next_pinned_object_path(
ebpf_assert(start_path);
ebpf_assert(next_path);

size_t start_path_length = strlen(start_path);
char canonical_start_path[MAX_PATH];
ebpf_result_t result = ebpf_canonicalize_path(canonical_start_path, sizeof(canonical_start_path), start_path);
if (result != ERROR_SUCCESS) {
EBPF_RETURN_RESULT(EBPF_INVALID_ARGUMENT);
}

size_t canonical_start_path_length = strlen(canonical_start_path);

ebpf_protocol_buffer_t request_buffer(
EBPF_OFFSET_OF(ebpf_operation_get_next_pinned_object_path_request_t, start_path) + start_path_length);
EBPF_OFFSET_OF(ebpf_operation_get_next_pinned_object_path_request_t, start_path) + canonical_start_path_length);
ebpf_protocol_buffer_t reply_buffer(
EBPF_OFFSET_OF(ebpf_operation_get_next_pinned_object_path_reply_t, next_path) + (next_path_len - 1));
ebpf_operation_get_next_pinned_object_path_request_t* request =
Expand All @@ -4198,10 +4223,10 @@ ebpf_get_next_pinned_object_path(
reply->header.length = static_cast<uint16_t>(reply_buffer.size());

request->type = *type;
memcpy(request->start_path, start_path, start_path_length);
memcpy(request->start_path, canonical_start_path, canonical_start_path_length);

uint32_t error = invoke_ioctl(request_buffer, reply_buffer);
ebpf_result_t result = win32_error_code_to_ebpf_result(error);
result = win32_error_code_to_ebpf_result(error);
if (result != EBPF_SUCCESS) {
EBPF_RETURN_RESULT(result);
}
Expand All @@ -4214,6 +4239,16 @@ ebpf_get_next_pinned_object_path(
}
CATCH_NO_MEMORY_EBPF_RESULT

_Must_inspect_result_ ebpf_result_t
ebpf_canonicalize_pin_path(_Out_writes_(output_size) char* output, size_t output_size, _In_z_ const char* input)
NO_EXCEPT_TRY
{
EBPF_LOG_ENTRY();
ebpf_result_t result = ebpf_canonicalize_path(output, output_size, input);
EBPF_RETURN_RESULT(result);
}
CATCH_NO_MEMORY_EBPF_RESULT

static ebpf_result_t
_get_next_id(ebpf_operation_id_t operation, ebpf_id_t start_id, _Out_ ebpf_id_t* next_id) NO_EXCEPT_TRY
{
Expand Down
21 changes: 8 additions & 13 deletions libs/execution_context/ebpf_native.c
Original file line number Diff line number Diff line change
Expand Up @@ -14,7 +14,6 @@

#include <intrin.h>

#define DEFAULT_PIN_ROOT_PATH "/ebpf/global"
#define EBPF_MAX_PIN_PATH_LENGTH 256

static const uint32_t _ebpf_native_marker = 'entv';
Expand Down Expand Up @@ -1020,29 +1019,25 @@ _ebpf_native_initialize_maps(

if (entry->definition.pinning == LIBBPF_PIN_BY_NAME) {
// Construct the pin path.
size_t prefix_length = strnlen(DEFAULT_PIN_ROOT_PATH, EBPF_MAX_PIN_PATH_LENGTH);
size_t name_length = strnlen_s(entry->name, BPF_OBJ_NAME_LEN);
if (name_length == 0 || name_length >= BPF_OBJ_NAME_LEN ||
prefix_length + name_length + 1 >= EBPF_MAX_PIN_PATH_LENGTH) {
char canonical_path[EBPF_MAX_PIN_PATH_LENGTH];
result = ebpf_canonicalize_path(canonical_path, sizeof(canonical_path), entry->name);
if (result != EBPF_SUCCESS) {
EBPF_LOG_MESSAGE_GUID(
EBPF_TRACELOG_LEVEL_ERROR,
EBPF_TRACELOG_KEYWORD_NATIVE,
"_ebpf_native_initialize_maps: map pin path too long",
"_ebpf_native_initialize_maps: invalid map pin path",
module_id);
result = EBPF_INVALID_ARGUMENT;
goto Done;
}

native_maps[i].pin_path.value =
ebpf_allocate_with_tag(prefix_length + name_length + 1, EBPF_POOL_TAG_NATIVE);
size_t length = strlen(canonical_path);
native_maps[i].pin_path.value = ebpf_allocate_with_tag(length, EBPF_POOL_TAG_NATIVE);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Off by one here for NUL?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The NUL isn't needed here since the length is stored. This isn't a change from before, which also didn't include space for a NUL. The +1 before was for the "/", as can be seen in old lines 1042-1045.

if (native_maps[i].pin_path.value == NULL) {
result = EBPF_NO_MEMORY;
goto Done;
}
native_maps[i].pin_path.length = prefix_length + name_length + 1;
memcpy(native_maps[i].pin_path.value, DEFAULT_PIN_ROOT_PATH, prefix_length);
memcpy(native_maps[i].pin_path.value + prefix_length, "/", 1);
memcpy(native_maps[i].pin_path.value + prefix_length + 1, entry->name, name_length);
native_maps[i].pin_path.length = length;
memcpy(native_maps[i].pin_path.value, canonical_path, length);
} else {
native_maps[i].pin_path.value = NULL;
native_maps[i].pin_path.length = 0;
Expand Down
13 changes: 13 additions & 0 deletions libs/shared/ebpf_shared_framework.h
Original file line number Diff line number Diff line change
Expand Up @@ -219,4 +219,17 @@ ebpf_result_t
ebpf_duplicate_program_data(
_In_ const ebpf_program_data_t* program_data, _Outptr_ ebpf_program_data_t** new_program_data);

/**
* @brief Canonicalize a path using filesystem canonicalization rules.
*
* @param[out] output Buffer in which to write canonicalized path.
* @param[in] output_size Size of output buffer.
* @param[out] error_code Zero on success, non-zero Win32 error code on failure.
*
* @retval EBPF_SUCCESS The operation was successful.
* @retval EBPF_INVALID_ARGUMENT The input path was invalid.
*/
ebpf_result_t
ebpf_canonicalize_path(_Out_writes_(output_size) char* output, size_t output_size, _In_z_ const char* input);

CXPLAT_EXTERN_C_END
Loading
Loading