Skip to content

[azure-eventhub] Fix handling of connection strings with entity path #43716

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 15 commits into
base: main
Choose a base branch
from

Conversation

zmoog
Copy link
Contributor

@zmoog zmoog commented Apr 6, 2025

Proposed commit message

Fix entity path handling in the event hub connection string (v1 and v2 processors).

For context on this issue, see #43715.

Changes:

  • v1 & v2: fully parse the connection string for validation and to check the event hub name and entity path match.
  • v2: handle event hub name and connection string according to the new Event Hub SDK requirements to support all connection string formats.

Unrelated changes:

  • I had to fix an unrelated problem to metricbeat/module/windows/service/service_integration_test.go to make the linter happy again.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

Related issues

@zmoog zmoog self-assigned this Apr 6, 2025
@botelastic botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label Apr 6, 2025
Copy link
Contributor

mergify bot commented Apr 6, 2025

This pull request does not have a backport label.
If this is a bug or security fix, could you label this PR @zmoog? 🙏.
For such, you'll need to label your PR with:

  • The upcoming major version of the Elastic Stack
  • The upcoming minor version of the Elastic Stack (if you're not pushing a breaking change)

To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-8./d is the label to automatically backport to the 8./d branch. /d is the digit
  • backport-active-all is the label that automatically backports to all active branches.
  • backport-active-8 is the label that automatically backports to all active minor branches for the 8 major.
  • backport-active-9 is the label that automatically backports to all active minor branches for the 9 major.

@zmoog zmoog added the backport-active-all Automated backport with mergify to all the active branches label Apr 6, 2025
@zmoog zmoog force-pushed the zmoog/fix/azure-eventhub/entity-path branch from 665e226 to 573a462 Compare April 6, 2025 13:29
@zmoog zmoog added Team:obs-ds-hosted-services Label for the Observability Hosted Services team Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team labels Apr 7, 2025
@botelastic botelastic bot removed the needs_team Indicates that the issue/PR needs a Team:* label label Apr 7, 2025
@zmoog zmoog marked this pull request as ready for review April 7, 2025 07:02
@zmoog zmoog requested review from a team as code owners April 7, 2025 07:02
@zmoog zmoog requested review from AndersonQ and faec April 7, 2025 07:02
@elasticmachine
Copy link
Collaborator

Pinging @elastic/obs-ds-hosted-services (Team:obs-ds-hosted-services)

@elasticmachine
Copy link
Collaborator

Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)

@zmoog zmoog changed the title [azure-eventhub] Processor v2: handle connection string with entity path [azure-eventhub] Fix handling of connection strings with entity path Apr 7, 2025
}

keyAndValue := strings.SplitN(split, "=", 2)
if len(keyAndValue) < 2 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Blocker]

Looking at the test above, won't it introduce a breaking change?
In the test the connection string was:

sb://test-ns.servicebus.windows.net/;SharedAccessKeyName=RootManageSharedAccessKey;SharedAccessKey=SECRET

now it's

Endpoint=sb://test-ns.servicebus.windows.net/;SharedAccessKeyName=RootManageSharedAccessKey;SharedAccessKey=SECRET

The former won't pass this check, and if there are configs using the former format, wouldn't it make this a breaking change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The previous connection string version (starting with sb://test-ns.servicebus.windows.net...) was invalid.

The test was passing because the previous Validate() implementation only checked that the connection string wasn't empty, instead of trying to parse it.

According to the EventHubConnectionStringProperties page:

An Event Hubs connection string is a set of key-value pairs separated by semicolon. A typical example is "Endpoint=sb://foo.EventHub.windows.net/;SharedAccessKeyName=someKeyName;SharedAccessKey=someKeyValue".

A connection may have the following sections:

  • Endpoint, which is mandatory. The fully qualifed namespace of the Event Hubs namespace. It will look similar to "{your-namespace}.servicebus.windows.net"
  • SharedAccessKeyName and SharedAccessKey, optional, used to authenticate the access to the Event Hubs namespace or Event Hub instance.
  • EntityPath, optional, the name of the Event Hub instance.
  • SharedAccessSignature, optional, an alternative way to authenticate the access to an Event Hub instance.

Comment on lines 108 to 129
if csp.Emulator {
// check that they're only connecting to localhost
endpointParts := strings.SplitN(csp.Endpoint, ":", 3) // allow for a port, if it exists.

if len(endpointParts) < 2 || endpointParts[0] != "sb" || endpointParts[1] != "//localhost" {
// there should always be at least two parts "sb:" and "//localhost"
// with an optional 3rd piece that's the port "1111".
// (we don't need to validate it's a valid host since it's been through url.Parse() above)
return ConnectionStringProperties{}, fmt.Errorf("UseDevelopmentEmulator=true can only be used with sb://localhost or sb://localhost:<port number>, not %s", csp.Endpoint)
}
}

if csp.FullyQualifiedNamespace == "" {
return ConnectionStringProperties{}, fmt.Errorf("key %q must not be empty", endpointKey)
}

if csp.SharedAccessSignature == nil && csp.SharedAccessKeyName == nil {
return ConnectionStringProperties{}, fmt.Errorf("key %q must not be empty", sharedAccessKeyNameKey)
}

if csp.SharedAccessKey == nil && csp.SharedAccessSignature == nil {
return ConnectionStringProperties{}, fmt.Errorf("key %q or %q cannot both be empty", sharedAccessKeyKey, sharedAccessSignatureKey)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion]
Perform all checks and accumulate the errors to return them all at once. It's rather frustrating from the user perspective to fix a validation error just to discover there is still another to be fixed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point.

For context, this function comes from an internal package of the new Event Hub SDK.

As it is internal, I cannot use it directly; I planned to include it as-is — even if I did not like it very much (now even less, after your comment).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah good point on accumulating errors. But could be a follow up to improve the error message to reduce scope of the PR :)

Copy link
Contributor Author

@zmoog zmoog Apr 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AndersonQ, does improving the validation in a follow-up PR work for you?

@zmoog zmoog force-pushed the zmoog/fix/azure-eventhub/entity-path branch from 93cb87f to f13635c Compare April 7, 2025 16:13
Copy link
Contributor

@Kavindu-Dodan Kavindu-Dodan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks fine :)

@zmoog zmoog requested a review from AndersonQ April 8, 2025 11:50
The problem is not in the connection string but in the fact that the
entity in the connection string doesn't match the event hub name.
}
// Store the parsed connection string, so we can use it
// later in the input, when needed.
conf.ConnectionStringProperties = connectionStringProperties
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This use of Validate to precache essential data makes me uncomfortable... I don't think it's a bug in this case, since your config struct is private and you can ensure that configs are only generated by explicitly unpacking, but it adds a confusing implicit sequencing dependency. For example I'd expect that a test should be able to call newEventHubInputV2 with a valid config, but the way things are written here it would break unless the config was rewritten as yaml and then manually unpacked first.

I'd prefer that the cached connection string data wasn't in the user config struct at all, but if the field is there, can we at least initialize it in newEventHubInput*? Validation can be done with an uncached local copy (precalculating in this case is for code simplicity not performance, so there's no harm in parsing the string twice).


// if a key value pair has `=` in the value, recombine them
key := keyAndValue[0]
value := strings.Join(keyAndValue[1:], "=")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't the Join here superfluous since the SplitN parameter above ensures only the first = produces a split?

return ConnectionStringProperties{}, fmt.Errorf("key %q must not be empty", endpointKey)
}

if csp.SharedAccessSignature == nil && csp.SharedAccessKeyName == nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't these be non-nil but still empty strings? (and below)

@zmoog
Copy link
Contributor Author

zmoog commented Apr 14, 2025

Hey @AndersonQ, @faec, thank you so much for reviewing this PR! 🙇

I tried to cut some corners to meet a FF, but it's not worth it since we have a workaround for this issue.

I will rework (1) ConnectionStringProperties cache and (2) our copy of the internal code I copied from the Microsoft libraries to parse the connection string.

Moving the status back to draft. And I'll request a new review when ready.

@zmoog zmoog marked this pull request as draft April 14, 2025 10:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-active-all Automated backport with mergify to all the active branches bugfix input:azure-eventhub Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team Team:obs-ds-hosted-services Label for the Observability Hosted Services team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[azure-eventhub] Processors v1 & v2 don't handle connection strings with entity path correctly
5 participants