delete indeterminate number of matches in file/folder names

Post by RobEmenecker » Tue Jul 16, 2013 7:22 pm

Hi all,

I have a collection of files that at some point in their journeys had their names URL-encoded. That is, there are a number of occurrences of percent signs followed by two hex characters. I want to universally strip these out of the file and folder names, without putting anything in their place.

For example:
Original: Coke%2BPantry%2BFeb%2B6%2B2013.pptx
New: CokePantryFeb62013.pptx

The regex match pattern is simple: %[0-9A-F]{2}

There can be zero or more matches in a file or folder name and the matches can occur anywhere in the middle of the name and could also be adjacent to one another.
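For comparison, here is a minimal Python sketch of the behaviour being asked for: every match is deleted, wherever it sits in the name, and nothing is put in its place. The function name `strip_percent_codes` is made up for illustration; the pattern is the one given above, which only covers uppercase hex digits.

```python
import re

# Pattern from the post: a percent sign followed by two uppercase hex digits.
PERCENT_CODE = re.compile(r"%[0-9A-F]{2}")

def strip_percent_codes(name: str) -> str:
    """Delete every %XX sequence in the name, replacing it with nothing."""
    return PERCENT_CODE.sub("", name)

print(strip_percent_codes("Coke%2BPantry%2BFeb%2B6%2B2013.pptx"))
# prints: CokePantryFeb62013.pptx
```

Adjacent matches and names with no match at all need no special handling here, because the substitution simply runs over however many matches it finds.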

The problem I am running into is that BRU is not allowing me to enter this into the RegEx(1) "Match" and then leave the "Replace" empty, effectively saying, "remove every match from the name."

I have even tried entering an unused character, the caret (^), into the replace field, with the notion of doing a second sweep using Repl.(3) to strip out the caret characters, but this results in the ENTIRE NAME being replaced by a caret character if there is any match in the name.

I'm banging my head into a wall with this. I use regular expressions constantly in Javascript and PHP programming. I understand them. I also LOVE BRU and use it constantly. I've just never had to use it for regular expressions before right now.

I'm hoping this is something obvious that I am overlooking and someone can point out the problem. I'm not enjoying the notion of writing a file/folder renaming utility in PHP to do what I think BRU should be able to do for me.

Thoughts?
RobEmenecker
 
Posts: 3
Joined: Tue Jul 16, 2013 6:56 pm

remove several parts from file/folder names

Post by Stefan » Tue Jul 16, 2013 8:21 pm

In BRU, the RegEx always has to match the whole string (the filename).
You can capture parts for backreferencing and reuse them in the new name, and drop the other parts.
But you can't just search for parts of the filename and remove them wherever they appear.
You can only match these parts if you can find a common pattern across all the filenames.
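A rough Python model of this may help. It is inferred only from what is described in this thread, not from BRU's documentation or source, so treat it as an assumption: the replacement text becomes the entire new name, and only the parts captured in groups survive. The helper `rename_whole_name` is hypothetical.

```python
import re

def rename_whole_name(name: str, match: str, replace: str) -> str:
    """Assumed model: if the pattern matches, the expanded replacement
    (with \\1, \\2 backreferences) becomes the whole new name."""
    m = re.search(match, name)
    return m.expand(replace) if m else name

name = "Coke%2BPantry%2B2013.pptx"

# With no capture groups, nothing of the old name is kept.
print(rename_whole_name(name, r"%[0-9A-F]{2}", "^"))
# prints: ^

# Capturing the text on both sides keeps it, but only one code is dropped.
print(rename_whole_name(name, r"(.*)%[0-9A-F]{2}(.*)", r"\1\2"))
# prints: Coke%2BPantry2013.pptx
```

Under this model a pattern has a fixed number of groups, so it can drop only a fixed number of codes per pass, which is why an indeterminate number of matches is awkward to handle this way.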


So you are better off using Repl.(3) or Remove(5).

To do this for more than one part at once, it is best to use "Character Translation" (see the help and the forum).
Stefan
 
Posts: 736
Joined: Fri Mar 11, 2005 7:46 pm
Location: Germany, EU

Re: delete indeterminate number of matches in file/folder names

Post by RobEmenecker » Tue Jul 16, 2013 9:04 pm

Bugger. I was hoping that I was simply doing something wrong.

Okay, it looks like Character Translations will work for my purpose. There is a fairly predictable, finite set of hex character pairs, so I don't have to literally allow for every single combination. They are the usual URL-encoding suspects: ', ", +, ?, &, etc.
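The finite-set idea can be sketched in Python as a list of literal codes, each removed in turn. This is not BRU's Character Translation syntax, only an illustration of the approach; `CODES_TO_STRIP` and `strip_known_codes` are made-up names, and the list covers just the five characters named above.

```python
# Percent-codes for the characters named above: ' " + ? &
CODES_TO_STRIP = ["%27", "%22", "%2B", "%3F", "%26"]

def strip_known_codes(name: str) -> str:
    """Remove each listed code literally; any other %XX is left alone."""
    for code in CODES_TO_STRIP:
        name = name.replace(code, "")
    return name

print(strip_known_codes("Coke%2BPantry%2BFeb%2B6%2B2013.pptx"))
# prints: CokePantryFeb62013.pptx
```

The trade-off is that any code missing from the list stays in the name, so the list has to be extended whenever a new one turns up.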

Thanks for the quick response Stefan!

Re: delete indeterminate number of matches in file/folder names

Post by RobEmenecker » Tue Jul 16, 2013 9:22 pm

One last question. Does the command line version support character translations?

