Possible places for improvement

May 21, 2012 at 9:21 PM
Edited May 21, 2012 at 9:22 PM

Can you tell us which parts of the code you got from the original Perl, and maybe give an idea of how authoritative that source was? I ask because I think some areas of the regex could be improved. The numberPattern part, for example: I see ways to reduce unnecessary backtracking, and I'm wondering if I'm missing something that some Perl genius intended to do. Also, is there any reason you haven't used RegexOptions.ExplicitCapture?

May 30, 2012 at 2:08 PM

95% is derived from the original Perl source at http://cpansearch.perl.org/src/SDERLE/Geo-StreetAddress-US-0.99/US.pm

I started at the top of the file and translated to C# as I went.

RegexOptions.ExplicitCapture wasn't used for a few reasons: Perl does not support this option, I didn't know it existed, and the original Perl expressions used the ?: syntax, so I kept that during the translation.
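
For anyone comparing the two approaches, here is a minimal sketch of what RegexOptions.ExplicitCapture changes. The patterns and the class name ExplicitCaptureDemo are made up for illustration and are not taken from the parser's source. The idea is that with ExplicitCapture, plain parentheses behave like (?:...) and only named groups capture.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ExplicitCaptureDemo
{
    // Hypothetical, simplified patterns (assumptions, not the parser's real ones).
    // Same structure twice: once with (?:...) and once with plain parentheses.
    public const string WithNonCapturing = @"^(?<number>\d+)(?:-(?:\d+))?\s+(?<street>\w+)$";
    public const string WithPlainParens  = @"^(?<number>\d+)(-(\d+))?\s+(?<street>\w+)$";

    // Number of groups the compiled regex exposes, including group 0 (the whole match).
    public static int GroupCount(string pattern, RegexOptions options)
    {
        return new Regex(pattern, options).GetGroupNumbers().Length;
    }

    // Value of a named group, or null when the input does not match.
    public static string Capture(string pattern, RegexOptions options, string input, string group)
    {
        Match m = Regex.Match(input, pattern, options);
        return m.Success ? m.Groups[group].Value : null;
    }

    public static void Main()
    {
        // (?:...) everywhere: group 0 plus the two named groups.
        Console.WriteLine(GroupCount(WithNonCapturing, RegexOptions.None));            // 3
        // Plain parentheses without the option: two extra numbered groups.
        Console.WriteLine(GroupCount(WithPlainParens, RegexOptions.None));             // 5
        // Plain parentheses with the option: unnamed groups no longer capture.
        Console.WriteLine(GroupCount(WithPlainParens, RegexOptions.ExplicitCapture));  // 3
    }
}
```

Either style should give the same named captures, so the choice is mostly about readability: with ExplicitCapture the ?: prefixes can be dropped from the pattern text.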

The hardest part of the translation was that the original Perl uses variable assignment inline with the regular expression, something that C# doesn't support. It's certainly possible that some of those non-capturing groups are no longer necessary.
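
As a rough illustration of the usual C# substitute for inline assignment, the sketch below uses named groups and copies their values into a dictionary after the match. The pattern, the group names (number, street, type) and the class NamedGroupDemo are hypothetical and much simpler than the real expressions; this is not how the translated parser is actually structured.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class NamedGroupDemo
{
    // Hypothetical toy pattern (an assumption for illustration only).
    // Each named group stands in for a value Perl would assign during matching.
    static readonly Regex Address = new Regex(
        @"^(?<number>\d+)\s+(?<street>[A-Za-z]+)\s+(?<type>St|Ave|Rd)$",
        RegexOptions.IgnoreCase | RegexOptions.ExplicitCapture);

    // Returns the named captures as key/value pairs; empty when nothing matches.
    public static Dictionary<string, string> Parse(string input)
    {
        var result = new Dictionary<string, string>();
        Match m = Address.Match(input);
        if (!m.Success) return result;

        foreach (string name in Address.GetGroupNames())
        {
            if (name == "0") continue;          // skip the whole-match group
            Group g = m.Groups[name];
            if (g.Success) result[name] = g.Value;
        }
        return result;
    }

    public static void Main()
    {
        foreach (KeyValuePair<string, string> pair in Parse("123 Main St"))
        {
            Console.WriteLine(pair.Key + " = " + pair.Value);
        }
    }
}
```

Because the values are read after the match completes, groups that only existed to scope an inline assignment in Perl may turn out to be redundant here.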

Any feedback is welcome!