[dry-inflector] Question about underscore and camelize behavior with dashes

Hello, I recently released command_kit, a zero-dependency, all-in-one, kitchen-sink-included kit for building CLI apps. I borrowed some functions from dry-inflector, since I only needed a very basic inflector for converting file names and command names to class names. My next goal was to port command_kit to Crystal. However, Crystal does not support Regexp.last_match (which it shouldn’t, because it is global state), so I refactored the inflector methods to use StringScanner, which improved the performance of underscore but slowed down camelize. Eventually I want to submit a PR back to dry-inflector, but my algorithm deviates slightly from dry-inflector’s original algorithm:

  • underscore reduces multiple _ and -s to a single _ (ex: Foo---Bar___Baz -> foo_bar_baz)
  • camelize omits any -s (ex: foo-bar -> FooBar)
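
For illustration, here is a minimal StringScanner sketch of the two behaviors listed above. It is not the command_kit or dry-inflector code; the SketchInflector module and its word-matching regexp are assumptions made up for this example.

```ruby
require 'strscan'

# Hypothetical helpers for illustration only; not the command_kit or
# dry-inflector implementations.
module SketchInflector
  # CamelCase -> under_scored. Any run of "_" or "-" collapses to one "_".
  def self.underscore(name)
    scanner = StringScanner.new(name)
    words   = []

    until scanner.eos?
      if scanner.skip(/[_-]+/)
        # Separator run: dropped here; join adds a single "_" later.
      elsif (word = scanner.scan(/[A-Z]+(?![a-z])|[A-Z]?[a-z\d]+/))
        words << word.downcase
      else
        # Any other character becomes its own word.
        words << scanner.getch
      end
    end

    words.join('_')
  end

  # under_scored -> CamelCase. Every "_" and "-" is omitted.
  def self.camelize(name)
    scanner = StringScanner.new(name)
    result  = +''

    until scanner.eos?
      scanner.skip(/[_-]+/)
      word = scanner.scan(/[^_-]+/)
      # Upcase only the first character; leave the rest untouched.
      result << word[0].upcase << word[1..-1] if word
    end

    result
  end
end

puts SketchInflector.underscore('Foo---Bar___Baz') # foo_bar_baz
puts SketchInflector.camelize('foo-bar')           # FooBar
```

No Regexp.last_match is involved: each match is returned directly by the scanner, so nothing is read from global match state.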

This raised the question of what the ideal behavior should be. Should - be preserved, reduced, or entirely omitted when converting a CamelCase string to under_scored, or vice versa?

Here is an example script to illustrate differences in behavior.

#!/usr/bin/env ruby

begin
  require 'bundler/setup'
rescue LoadError => error
  abort error.message
end

require 'dry-inflector'
require 'command_kit/inflector'

dry_inflector = Dry::Inflector.new
command_kit = CommandKit::Inflector

camelcase = "Foo__BAR---Baz"
underscore = "foo___bar---baz"

puts "#{camelcase}\t-> DRY::Inflector#underscore        -> #{dry_inflector.underscore(camelcase)}"
puts "#{camelcase}\t-> CommandKit::Inflector#underscore -> #{command_kit.underscore(camelcase)}"


puts "#{underscore}\t-> DRY::Inflector#camelize         -> #{dry_inflector.camelize(underscore)}"
puts "#{underscore}\t-> CommandKit::Inflector#camelize  -> #{command_kit.camelize(underscore)}"

Output:

Foo__BAR---Baz	-> DRY::Inflector#underscore        -> foo__bar___baz
Foo__BAR---Baz	-> CommandKit::Inflector#underscore -> foo__bar___baz

foo___bar---baz	-> DRY::Inflector#camelize         -> FooBar---baz
foo___bar---baz	-> CommandKit::Inflector#camelize  -> FooBarBaz

Personally, I think any underscores or dashes should be stripped when converting from under_scored case to CamelCase. I’m not sure whether - should be preserved or omitted when converting CamelCase to under_scored case.

I think repeated dashes should be reduced to a single one and treated as one.

However, the underscore method seems to work fine without reducing the number of characters when replacing dashes with underscores.
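
The difference between the two options can be sketched with plain String methods. The helper names below are hypothetical and are not part of either library; they only show what "preserving" versus "reducing" the dash count means.

```ruby
# Hypothetical helpers for illustration only.

# One "_" per "-": the number of separator characters is preserved.
def preserve_dashes(name)
  name.tr('-', '_')
end

# Each run of "-" collapses into a single "_".
def reduce_dashes(name)
  name.gsub(/-+/, '_')
end

puts preserve_dashes('foo---bar--baz') # foo___bar__baz
puts reduce_dashes('foo---bar--baz')   # foo_bar_baz
```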

Update: I’ve fixed the regression introduced by my StringScanner algorithm, where the number of _ or - characters was not being preserved when converting to under_scored. I lost some of the performance gains, but it is still ~15% faster than DRY::Inflector#underscore.
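
A rough way to check a comparison like this is sketched below. Both methods are stand-ins written for this example (a gsub-based variant and a StringScanner variant that keeps the separator count); neither is the actual dry-inflector or command_kit code, and the timings will vary by machine and Ruby version.

```ruby
require 'strscan'

# Regexp-based stand-in (an assumption, not copied from dry-inflector).
def underscore_regexp(name)
  name.gsub(/([A-Z\d]+)([A-Z][a-z])/, '\1_\2')
      .gsub(/([a-z\d])([A-Z])/, '\1_\2')
      .tr('-', '_')
      .downcase
end

# StringScanner-based stand-in that preserves the number of separators.
def underscore_scanner(name)
  scanner = StringScanner.new(name)
  result  = +''

  until scanner.eos?
    if (separators = scanner.scan(/[_-]+/))
      # One "_" per original "_" or "-".
      result << ('_' * separators.length)
    elsif (word = scanner.scan(/[A-Z]+(?![a-z])|[A-Z]?[a-z\d]+/))
      # Insert a word boundary unless one is already there.
      result << '_' unless result.empty? || result.end_with?('_')
      result << word.downcase
    else
      # Other characters are copied through unchanged.
      result << scanner.getch
    end
  end

  result
end

# Wall-clock time for running the block the given number of times.
def time_it(iterations)
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  iterations.times { yield }
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

input        = 'Foo__BAR---Baz'
regexp_time  = time_it(50_000) { underscore_regexp(input) }
scanner_time = time_it(50_000) { underscore_scanner(input) }

printf("regexp:  %.4fs\nscanner: %.4fs\n", regexp_time, scanner_time)
```

Before comparing timings, it is worth confirming that both variants return the same string for the inputs being measured.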