Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add custom enum deserializer to improve error messaging, improve byte count error mesages #5076

Merged

Conversation

graytaylor0
Copy link
Member

@graytaylor0 graytaylor0 commented Oct 16, 2024

Description

Currently when enums are deserialized with an invalid value the error will say that the value must not be null, because the enums we use have a @JsonCreator that looks up the options in the map and returns null.

This adds a custom enum deserializer that will give a clearer message and provide the valid options to the user.

Also, the ByteCount error messages were improved to provide valid options and show which field is invalid.

New error message for invalid enum values

  1. entry-pipeline.processor.parse_json: caused by: Parameter "handle_failed_events" for plugin "parse_json" is invalid: Invalid value "skip_silentl". Valid options include [drop, drop_silently, skip, skip_silently].

Issues Resolved

Resolves #[Issue number to be closed when this PR is merged]

Check List

  • New functionality includes testing.
  • New functionality has a documentation issue. Please link to it in this PR.
    • New functionality has javadoc added
  • Commits are signed with a real name per the DCO

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

oeyh
oeyh previously approved these changes Oct 17, 2024
@@ -33,6 +34,7 @@ public class ObjectMapperConfiguration {
ObjectMapper extensionPluginConfigObjectMapper() {
final SimpleModule simpleModule = new SimpleModule();
simpleModule.addDeserializer(Duration.class, new DataPrepperDurationDeserializer());
simpleModule.addDeserializer(Enum.class, new EnumDeserializer());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will be applied to every Enum defined in the project?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it will be

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any configuration file that has a variable that is an Enum

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice 👍

Copy link
Contributor

@san81 san81 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Super helpful fix for the customers go understand the error message along with listing out possible values for them to fix. 👍

@@ -36,7 +36,8 @@ public DropEventsProcessor(

if (dropEventProcessorConfig.getDropWhen() != null &&
(!expressionEvaluator.isValidExpressionStatement(dropEventProcessorConfig.getDropWhen()))) {
throw new InvalidPluginConfigurationException("drop_when {} is not a valid expression statement. See https://opensearch.org/docs/latest/data-prepper/pipelines/expression-syntax/ for valid expression syntax");
throw new InvalidPluginConfigurationException(String.format("drop_when \"%s\" is not a valid expression statement. " +
"See https://opensearch.org/docs/latest/data-prepper/pipelines/expression-syntax/ for valid expression syntax", dropEventProcessorConfig.getDropWhen()));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a concern but just helpful advice for any possible future refactoring needs. What if the documentation url path changes in the future? May be better to reference these urls from a single constants file or something like that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good suggestion we should do that. We have these in other locations as well like the @JsonPropertyDescriptions

@@ -79,11 +79,13 @@ public static ByteCount parse(final String string) {
final String unitString = matcher.group("unit");

if(unitString == null) {
throw new ByteCountInvalidInputException("Byte counts must have a unit.");
throw new ByteCountInvalidInputException("Byte counts must have a unit. Valid byte units include: " +
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think, it is better to enhance the corresponding test case messages to assert for this updated message. Currently, they are only looking for not null.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll update the test case to assert for this message

}

final Unit unit = Unit.fromString(unitString)
.orElseThrow(() -> new ByteCountInvalidInputException("Invalid byte unit: '" + unitString + "'"));
.orElseThrow(() -> new ByteCountInvalidInputException("Invalid byte unit: '" + unitString + "'. Valid byte units include: "
+ Arrays.stream(Unit.values()).map(unitValue -> unitValue.unitString).collect(Collectors.toList())));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably a nit. This Arrays.stream statement repeated twice. It is also there in line number 83. I think you can add a method in Unit enum class to return this array of code values to keep the logic at one place.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line 51 already does something similar, though it creates a map which will be out of order. If you make it a linked map you could use the keys() from it.

private static final Map<String, Unit> UNIT_MAP = Arrays.stream(Unit.values())
.collect(Collectors.toMap(unit -> unit.unitString, Function.identity()));

final String enumValue = node.asText();

for (Object enumConstant : enumClass.getEnumConstants()) {
if (enumConstant.toString().equalsIgnoreCase(enumValue)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is the right approach. We should be sticking with a single case to avoid multiple possible values for the same thing.

Rather, use the core Jackson deserializer to get the value. Or, use @JsonCreator which is what it uses.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By the way, I think that is fine to make something like enumConstant.toString().toLowerCase().equals(enumValue) as a default behavior. But, we should still support @JsonValue and @JsonCreator. So, you could have logic like the following pseudo-code:

if(findAnnotedMethod(enumClass, JsonCreator.class)) {
  return useJsonCreator(enumClass);
} else {
  if( enumConstant.toString().toLowerCase().equals(enumValue) {
    return enumConstant;
  }
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a problem with allowing both lowercase and uppercase? I think it is fair to assume that if the user inputs either, this is the intended choice. I have felt like it is not clear which value to put into the YAML before.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should avoid it because it ends up causing exactly this confusion. So rather than be so loose that anything goes, let's just require the correct input. We can consistently take in lowercase-values which is what our documentation tells people to do.

Also, this is so loose that we will allow aLL-soRTs-of-VALues.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can JsonValue just be used for this too? Not sure how we would use JsonCreator for this

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now the enums all use @JsonCreator to determine how to deserialize from JSON into POJO. The @JsonValue determines how to reverse this for the schemas. Maybe @JsonValue can do both, but I'm not sure.

This is the pattern:

@JsonCreator
static TargetType fromOptionValue(final String option) {
return OPTIONS_MAP.get(DataType.fromTypeName(option));
}
@JsonValue
public String getOptionValue() {
return dataType.getTypeName();
}

}

throw new IllegalArgumentException(String.format(INVALID_ENUM_VALUE_ERROR_FORMAT, enumValue,
Arrays.stream(enumClass.getEnumConstants()).map(valueEnum -> valueEnum.toString().toLowerCase()).collect(Collectors.toList())));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, let's not assume these are always the same. Use @JsonValue instead.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we assume that enums will always have a method annotated with @JsonValue, or should we fall back to default lowercase behavior if no method is found?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, we shouldn't assume this. Some may be missing. I think falling back is a good solution when there is no @JsonValue.

final String enumValue = node.asText();

final Optional<Method> jsonCreator = findJsonCreatorMethod();
jsonCreator.ifPresent(method -> method.setAccessible(true));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally, this is in a try-finally block.

try {
  jsonCreator.ifPresent(method -> method.setAccessible(true));

  // for loop
}
finally {
  jsonCreator.ifPresent(method -> method.setAccessible(false));
}

}

private enum TestEnum {
TEST("test");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this test case needs a different name to ensure it works, right?

Suggested change
TEST("test");
TEST("test_display_value");

final EnumDeserializer objectUnderTest = new EnumDeserializer();
JsonDeserializer<?> result = objectUnderTest.createContextual(context, property);

assertTrue(result instanceof EnumDeserializer);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
assertTrue(result instanceof EnumDeserializer);
assertThat(result, instanceOf(EnumDeserializer.class));

This will output the class type if it fails the assertion.

}

@ParameterizedTest
@EnumSource(HandleFailedEventsOption.class)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's rely on test classes to avoid changes in one class affecting others. Maybe update TestEnum with more values.

Signed-off-by: Taylor Gray <[email protected]>
Copy link
Member

@dlvenable dlvenable left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @graytaylor0 !

@graytaylor0 graytaylor0 merged commit e9bffee into opensearch-project:main Oct 23, 2024
46 of 47 checks passed
graytaylor0 added a commit to graytaylor0/data-prepper that referenced this pull request Oct 25, 2024
graytaylor0 added a commit to graytaylor0/data-prepper that referenced this pull request Oct 25, 2024
graytaylor0 added a commit to graytaylor0/data-prepper that referenced this pull request Oct 25, 2024
…ove byte count error mesages (opensearch-project#5076)"

This reverts commit e9bffee.
Signed-off-by: Taylor Gray <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants