Basic Expressions

A swapping-ground for Regular Expression syntax

Postby jelee » Thu Jan 09, 2014 11:23 pm

Hey,

I need some help.

If there are four digits in a filename, I want them put between brackets:
example
1999 -> (1999)
(1998) -> (1998)
etc

From a certain word onward, I want everything removed:
example
name (1999) [DVD] 'remove' -> Name (1999) [DVD]
name (1999) [1080p] 'remove' -> Name (1999) [1080p]

Thanks in advance!
jelee
 
Posts: 4
Joined: Thu Jan 09, 2014 11:14 pm

Basic Expressions

Postby truth » Fri Jan 10, 2014 4:37 am

Note I'm assuming CertainWord always exists, & should be dropped along with the remaining text.
The below parenthesizes a final ####. It won't touch (####) or [####*] as in your examples.

(.*[^[(])([0-9]{4})([^)].*)CertainWord.*
\1(\2)\3

Post back if you need something more selective.
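
For anyone who wants to try the pattern outside the renamer, here is a minimal Python sketch. It assumes Python's re module treats this pattern the same way the renamer's regex engine does. The word remove stands in for CertainWord, the sample names are made up from the examples above, and the [ inside the character class is escaped only to keep Python from warning about it.

```python
import re

# "remove" is a stand-in for CertainWord; swap in the real word.
PATTERN = re.compile(r"(.*[^\[(])([0-9]{4})([^)].*)remove.*")


def parenthesize_and_trim(name: str) -> str:
    """Wrap the final bare 4-digit run in () and drop CertainWord onward."""
    # Same replacement as in the post: \1(\2)\3
    return PATTERN.sub(r"\1(\2)\3", name, count=1)


if __name__ == "__main__":
    for sample in ("name 1999 [DVD] remove this",
                   "name (1999) [DVD] remove this"):
        print(repr(sample), "->", repr(parenthesize_and_trim(sample)))
```

In this sketch the first name comes out as 'name (1999) [DVD] ' (with a trailing space). The second has no bare 4-digit run, so the pattern does not match and the name is left as it was, CertainWord included.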
truth
 
Posts: 221
Joined: Tue Jun 25, 2013 3:39 am
Location: Earth, OrionArm, MilkyWay

Re: Basic Expressions

Postby jelee » Fri Jan 10, 2014 5:02 am

Thank you for your fast reply. Maybe my question was a bit unclear because I used Google Translate (my English is not great).

Thank you for the start...

This is what I want (if it is possible):
Old
Star Wars Episode 1 The Phantom Menace 1999 iNT DVD -aNBc
The Air I Breathe 2007 [DVD] iNT-DEViSE
Dead Before Dawn 2012 1080p -PSiG


Step 1:
(.*[^[(])([0-9]{4})([^)].*)
\1(\2)\3

Star Wars Episode 1 The Phantom Menace (1999) iNT DVD -aNBc
The Air I Breathe (2007) [DVD] iNT-DEViSE
Dead Before Dawn (2012) 1080p -PSiG


Wanted end result:
Star Wars Episode 1 The Phantom Menace (1999) [DVD]
The Air I Breathe (2007) [DVD]
Dead Before Dawn (2012) [1080p]

Basic Expressions

Postby truth » Fri Jan 10, 2014 7:13 am

The 1st regex parenthesizes only the last run of 4 #'s, so for the Dead Before Dawn example it gives 2012 (1080)p, not (2012) 1080p as shown!
You can always just post more OrigFileNames & DesiredRenames for clarity.

Does a 4# year always exist? Is it always in the format: Space####Space ?
What other words (besides DVD) should always get [bracketed] ?
Are all ####p's 4#'s, and never 3 as in ###p ?

Most likely your situation calls for several runs, unless you're willing to go commandline.
Hard to say at this point, much depends on the answers.
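
The point about the last 4 #'s can be checked with a small Python sketch. It assumes Python's re module is greedy in the same way as the renamer's engine, and it escapes the [ inside the character class only to avoid a Python warning. The three names are the "Old" ones from the post above.

```python
import re

# Step 1 pattern from the thread, without CertainWord.
STEP1 = re.compile(r"(.*[^\[(])([0-9]{4})([^)].*)")


def step1(name: str) -> str:
    """Apply the step 1 match/replace once."""
    return STEP1.sub(r"\1(\2)\3", name, count=1)


if __name__ == "__main__":
    for old in ("Star Wars Episode 1 The Phantom Menace 1999 iNT DVD -aNBc",
                "The Air I Breathe 2007 [DVD] iNT-DEViSE",
                "Dead Before Dawn 2012 1080p -PSiG"):
        print(step1(old))
```

Because the leading .* is greedy, the pattern settles on the last 4-digit run it can find. In the third name that run is the 1080 of 1080p, so the year stays bare.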

Re: Basic Expressions

Postby jelee » Fri Jan 10, 2014 1:10 pm

Wie is. That.Girl.1987.DVDRip.XviD-Gff
-> Who's That Girl (1987) [DVD]
Crash.2004.DVDRip.XviD.AC3-WAF
-> Crash (2004) [DVD]
G.I.Joe.Retaliation.2013.1080p.XviD-PTpOWeR
-> G I Joe Retaliation (2013) [1080p]
Nightwatch.1994.720p.BluRay.x264-FARGIRENIS [PublicHD]
-> Nightwatch (1994) [720p]
Star Wars Episode 1 The Phantom Menace 1999 iNT DVD -aNBc
-> Star Wars Episode 1 The Phantom Menace (1999) [DVD]
The Air I Breathe 2007 [DVD] iNT-DEViSE
-> The Air I Breathe (2007) [DVD]
Dead Before Dawn 2012 1080p -PSiG
-> Dead Before Dawn (2012) [1080p]


Above are some more examples.

Does a 4# year always exist?
No, it doesn't; sometimes there is no year in the title.
Is it always in the format: Space####Space ?
No, there.are.often.points.in.the.title

What other words (besides DVD) should always get [bracketed] ?
720p ->[720p]
1080p -> [1080p]
DVDRip -> [DVD]

No, I'm going to use several runs. I'm TERRIBLE at the command line, and the GUI shows me what the change will be.

Basic Expressions

Postby truth » Sat Jan 11, 2014 3:53 am

In order to match year when it exists, it must be defined precisely.
Going by your examples, the best run for a year-match currently seems to be:

(.*)[\. ]([0-9]{4})[\. ](.*)
\1 (\2) \3

Use the below Options/CharTranslations to handle your [BracketedWords]
7,2,0,p=[,7,2,0,p,]
1,0,8,0,p=[,1,0,8,0,p,]
D,V,D=[,D,V,D,]
D,V,D,R,i,p=[,D,V,D,]
,i,N,T, = ,
[,[=[
],]=]
.=

Note the ,i,N,T, line should begin with a Space (this forum removes leading spaces).
The final comma in that same line isn't needed; it's only there to make the space obvious.
I'd also probably use #5Remove CropAfter=] and D/S=Checked

That gives filenames as described so far.
Unfortunately, many filenames may need a 2nd run, & BRU can only process 1 regex-per-rename.
Note that if "Wie is" is a language translation for "Who's", add the below into CharTranslation:
W,i,e, ,i,s=W,h,o,',s
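
To see how the pieces fit together, here is a rough Python sketch of the whole run. It is only an emulation under assumptions, not a description of how BRU processes names internally. It assumes the regex runs first, then the translations in the order listed, then the crop after the first ], then double-space removal. It also assumes the iNT entry has its leading space restored and that the period entry maps to a space (the post shows only ".=").

```python
import re

# Year match from the post above.
YEAR_RE = re.compile(r"(.*)[\. ]([0-9]{4})[\. ](.*)")

# Translation pairs in the listed order. Assumptions: " iNT " has its
# leading space restored, and "." maps to a space.
TRANSLATIONS = [
    ("720p", "[720p]"),
    ("1080p", "[1080p]"),
    ("DVD", "[DVD]"),
    ("DVDRip", "[DVD]"),  # never fires here: "DVD" was already replaced
    (" iNT ", " "),
    ("[[", "["),
    ("]]", "]"),
    (".", " "),
]


def rename(name: str) -> str:
    """Emulate one run: regex, translations, crop after ], tidy spaces."""
    name = YEAR_RE.sub(r"\1 (\2) \3", name, count=1)
    for old, new in TRANSLATIONS:
        name = name.replace(old, new)
    head, bracket, _rest = name.partition("]")  # crop after the first ]
    return " ".join((head + bracket).split())   # collapse repeated spaces


if __name__ == "__main__":
    for old in ("Crash.2004.DVDRip.XviD.AC3-WAF",
                "G.I.Joe.Retaliation.2013.1080p.XviD-PTpOWeR",
                "The Air I Breathe 2007 [DVD] iNT-DEViSE"):
        print(rename(old))
```

With these assumptions the sample names come out in the wanted form, for instance Crash (2004) [DVD]. Under this sequential ordering the leftover Rip from DVDRip is removed by the crop rather than by its own translation entry.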
Last edited by truth on Sun Jan 12, 2014 5:56 pm, edited 1 time in total.

Re: Basic Expressions

Postby jelee » Sat Jan 11, 2014 6:02 pm

Thanks a lot, truth!!

I only see a few issues when checking things out.

Between (date) and [type] there is sometimes some extra text. Is that easy to remove automatically, or should I put every word in the Character Translations?

Is it possible to put a space in Character Translations? H,D,R,i,p=[,H,i,g,h D,e,f,i,n,i,t,i,o,n,]

Sometimes the year is written differently, as [date] when it needs to be (date). That will cause problems when I'm using the Remove option.

Basic Expressions

Postby truth » Sun Jan 12, 2014 3:01 am

So long as ALL occurrences of StandingText should be removed: Yes, add it into CharTranslations.
Just give your filenames a good lookover before employing Translations you're not certain of.

Suppose your StandingText=ixdVR & you might have filenames with multiple ixdVR's (that you wish to retain):
Use 12Filter=*ixdVR*ixdVR* to verify whether such filenames exist; it's a good pre-checker.

Spaces are input into CharTrans just like other chars (see last sentence of last post)
This last example would be: H,D,R,i,p=[,H,i,g,h, ,D,e,f,i,n,i,t,i,o,n,]

It might be easier to run a 2nd regex to replace )*[ --> )Space[ so all words in between ) and [ get removed?
It depends on your filenames & how many 'CharTrans-Words' you may want to input (I save them in text-files).

I'm not sure I understand the last sentence. Are you saying that some OrigFileNames also have [year] ??
If so, use the below regex with []'s added as 'year-matchers'
(.*)[[\. ]([0-9]{4})[]\. ](.*)
\1 (\2) \3
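
A small Python sketch of both ideas from this post, for testing away from the renamer. It assumes Python's re module behaves like the renamer's engine for these patterns. The brackets inside the character classes are escaped only to keep Python quiet, the )*[ shorthand is written out as a real regex, and the sample names are made up. The space collapsing at the end stands in for D/S.

```python
import re

# Year bordered by a space, period, or bracket.
YEAR = re.compile(r"(.*)[\[. ]([0-9]{4})[\]. ](.*)")

# Everything between ")" and the last "[" (the ")*[" idea written as a regex).
BETWEEN = re.compile(r"\).*\[")


def bracket_year(name: str) -> str:
    """Turn a bordered 4-digit year, including [year], into (year)."""
    renamed = YEAR.sub(r"\1 (\2) \3", name, count=1)
    return " ".join(renamed.split())  # collapse doubled spaces


def drop_between(name: str) -> str:
    """Replace whatever sits between ")" and "[" with a single space."""
    return BETWEEN.sub(") [", name, count=1)


if __name__ == "__main__":
    print(bracket_year("Some Movie [2010] DVDRip"))
    print(drop_between("Crash (2004) XviD AC3 [DVD]"))
```

Because the .* in the second pattern is greedy, it runs to the last [ in the name, so a name with several bracketed words would lose all but the final one.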

Improved Border-Matching for 4DigitYear

Postby truth » Thu Feb 06, 2014 3:17 am

The last regex matches 4DigitYear when bordered by either a space, period, or bracket.
It can fail with names like: Aaa.2014. 2099.zzz since it can match Space2099. instead of .2014.

To specify that 4DigitYear must be bordered by the exact same char on both sides,
or in your case, also when 4DigitYear can be bordered as [4DigitYear], use the below instead:

1Regex match/replace:
(.*)(([\. ])([0-9]{4})\3|\[([0-9]{4})\])(.*)
\1 (\4\5) \6

It works by using \3 in the match to require the same Group3BorderChar again after 4DigitYear.
In this case, it's sub-grouped so OR can also match rare occurrences of [4DigitYear].
Only one side of the OR can match, so \4\5 in the replacement gives 4DigitYear either way.

This regex won't touch filenames like aaa 2014.2014 [2014 2014.zzz, if that's an issue.
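
The difference between the two year patterns can be compared side by side in Python. This is a sketch that assumes Python's re module (3.5 or later, where an unmatched group in the replacement becomes an empty string) behaves like the renamer's engine. The function names loose and strict are just labels for this example.

```python
import re

# Earlier pattern: any mix of space, period, or bracket as borders.
LOOSE = re.compile(r"(.*)[\[. ]([0-9]{4})[\]. ](.*)")

# Improved pattern: same border char on both sides (\3), or [year].
STRICT = re.compile(r"(.*)(([\. ])([0-9]{4})\3|\[([0-9]{4})\])(.*)")


def loose(name: str) -> str:
    return LOOSE.sub(r"\1 (\2) \3", name, count=1)


def strict(name: str) -> str:
    # Only one of groups 4 and 5 takes part in a match; the other is empty.
    return STRICT.sub(r"\1 (\4\5) \6", name, count=1)


if __name__ == "__main__":
    for sample in ("Aaa.2014. 2099.zzz",
                   "Aaa [2014] zzz",
                   "aaa 2014.2014 [2014 2014.zzz"):
        print(repr(loose(sample)), "|", repr(strict(sample)))
```

For Aaa.2014. 2099.zzz the earlier pattern picks 2099 while the improved one picks 2014. The improved pattern leaves the last sample unchanged, since none of its years has matching borders. Doubled spaces in the output are left as they are, since D/S would tidy those in the renamer.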
