#1
On 9/15/07, Konrad Meyer <konrad@tylerc.org> wrote:
> Quoth Ilmari Heikkinen:
> > On 9/14/07, Konrad Meyer <konrad@tylerc.org> wrote:
> > > Hmm, am I not seeing it (just using 'mdh -p') or can metadata.rb extract
> > > stuff like artist, title, album, track, and whatnot from ogg/flac?
> >
> > It should at least. If you're having trouble, lemme know
>
> Yeah, I'm having some trouble. I have latest metadata (0.2).
>
> [snip]
>
> Any ideas?

Yeah, I failed at using git. Jeez. Sorry about that.
Here's 0.3, it oughta work:

tarball: http://dark.fhtr.org/repos/metadata/metadata-0.3.tar.gz
git: http://dark.fhtr.org/repos/metadata


On 9/15/07, darren kirby <bulliver@badcomputer.org> wrote:
> Hi Ilmari!
>
> Just wanted to mention that despite the name, wmainfo will parse anything
> wrapped in an ASF audio/video container format[0], so, you could use it to
> parse wmv movies as well if your user didn't have mplayer installed.
>
> [0] http://en.wikipedia.org/wiki/Advanced_Systems_Format

Thanks for the pointer!
I made it merge the wmainfo output to the mplayer output for wmv and asf.


Description
-----------

This package `Metadata' comes with a library called `metadata' and
a small program called `mdh'.

The library probes files for their metadata (e.g. jpeg dimensions
and camera make, mp3 artist, pdf word count) and returns the metadata
as a Hash.

Mdh can print out file metadata as YAML and package the metadata
with the file.

This package has many dependencies since there is no single universal
metadata header format that all files use. Blame resource forks, filename
extensions, bags of bytes and mimetypes.

Usage
-----

# print out metadata header
mdh -p myfile.jpg

# create myfile.jpg.mdh, which consists of metadata header + myfile.jpg
mdh myfile.jpg

# print out metadata header from mdh file
mdh -e -p myfile.jpg.mdh

# strip out metadata header from mdh file and save it to myfile.jpg
mdh -e myfile.jpg.mdh

irb> Metadata.extract('myfile.jpg')
irb> Metadata.extract_text('myfile.jpg')
irb> Pathname.new("myfile.jpg").metadata


List of supported formats
-------------------------

Audio:
  Successfully tested with:
    mp3, flac, ogg, wav
  Should also work:
    wma, m4a

Video:
  What you manage to make mplayer play, which can be just about anything.
  Then again, missing title and author data, etc. (do videos even have those?)
  Successfully tested with:
    wmv, mov, divx, xvid, flv, ogm, mpg

Images:
  Should handle pretty much anything (apart from XCF and ORF.)
  Successfully tested with:
    jpeg, png, gif, nef, dng, crw, pef, psd

Documents:
  Successfully tested with:
    pdf, ppt, odp, sxi, ps, ps.gz, html, txt
  Should work:
    - OpenOffice docs work to some degree (personally, I'm using unoconv to
      convert OO docs to temp PDFs for the text & dimensions extraction, so
      those bits of data are missing.)
    - MS Office docs to some degree (ppt at least, doc and xls should work too,
      dimensions missing due to the above temp PDF -thing.)

Others:
  Whatever extract spits out on the five or six bits of metadata I'm using
  from it. Archive contents at least.

Requirements
------------

* Ruby 1.8

* Tons of metadata extraction programs and libs,
  list of gems:
    flacinfo-rb
    wmainfo-rb
    MP4info
  list of debian packages:
    dcraw
    libimlib2-ruby
    extract
    libimage-exiftool-perl
    poppler-utils
    mplayer
    html2text
    imagemagick
    unhtml
    pstotext
    antiword
    catdoc
    shared-mime-info
    vorbis-tools

* You do want to install the latest versions of dcraw and
  shared-mime-info to be able to handle camera raw images.
  http://cybercom.net/~dcoffin/dcraw/
  http://freedesktop.org/wiki/Software/shared-mime-info

* Python + chardet library
  http://chardet.feedparser.org/

Install
-------

De-compress archive and enter its top directory.
Then type:

  ($ su)
  # ruby setup.rb

These simple steps install this program under the default
location of Ruby libraries. You can also install files into
your favorite directory by supplying setup.rb some options.
Try "ruby setup.rb --".


License
-------

Ruby's


--
Ilmari Heikkinen <ilmari.heikkinen gmail com>
http://fhtr.blogspot.com
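
The merge of wmainfo output into mplayer output mentioned in the reply to darren could look roughly like this. It is only a sketch, not the library's code: the `merge_metadata` helper, the sample hashes and their values are hypothetical, and the rule that an empty value never wins over a real one is an assumption.

```ruby
require 'yaml'

# Hypothetical helper: merge a secondary metadata Hash into a primary one.
# When both hashes have a key, the primary value is kept unless it is
# nil or empty, in which case the secondary value fills the gap.
def merge_metadata(primary, secondary)
  primary.merge(secondary) do |_key, old, new|
    old.nil? || old.to_s.empty? ? new : old
  end
end

# Made-up sample data standing in for the two extractors' output.
mplayer_info = { 'Video.Codec' => 'WMV3', 'Audio.Title' => '' }
wma_info     = { 'Audio.Title' => 'Some Title', 'Audio.Artist' => 'Some Artist' }

merged = merge_metadata(mplayer_info, wma_info)
puts merged.to_yaml
```

Printing with `to_yaml` mirrors what `mdh -p` is described as doing; the real key names may differ.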
#2
Quoth Ilmari Heikkinen:
> Here's 0.3, it oughta work:
>
> tarball: http://dark.fhtr.org/repos/metadata/metadata-0.3.tar.gz
> git: http://dark.fhtr.org/repos/metadata
>
> [snip]

Any chance you could wrap this up as a gem? It's not something I care strongly
about, and I don't know how complicated the process is, but I think it would
ease installation for some users.

--
Konrad Meyer <konrad@tylerc.org> http://konrad.sobertillnoon.com/
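
For a sense of how much work a gem would be, a specification along these lines might be enough. This is a hedged sketch: the gem name, the file globs and the summary wording are assumptions rather than anything taken from the tarball, and only the three gem dependencies come from the README. The Debian packages would still have to be installed separately.

```ruby
require 'rubygems'

# Hypothetical gemspec for the package; layout and naming are guesses.
spec = Gem::Specification.new do |s|
  s.name        = 'metadata'   # assumed gem name
  s.version     = '0.3'
  s.summary     = 'Probes files for their metadata and returns it as a Hash'
  s.authors     = ['Ilmari Heikkinen']
  s.files       = Dir['lib/**/*.rb'] + Dir['bin/*']   # assumed layout
  s.executables = ['mdh']
  # Gem dependencies listed in the README's Requirements section.
  s.add_dependency 'flacinfo-rb'
  s.add_dependency 'wmainfo-rb'
  s.add_dependency 'MP4info'
end
```

Saved as a `.gemspec` file, something like this is what `gem build` consumes.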
#3
Quoth Ilmari Heikkinen:
> Here's 0.3, it oughta work:
>
> [snip]

Er, I'm still not getting information out of ogg files:

$ mdh -p ~/music/bowling_for_soup_-_1985.ogg
---
Video.Duration: 192.78
Audio.Samplerate: 44100
Audio.Bitrate: 192.0
Image.DimensionUnit: px
Video.Codec: ""
File.Size: 4618665
Audio.Codec: vrbs
File.Modified: 2007-01-03T22:10:11-08:00
File.Format: video/x-theora+ogg

$ mplayer ~/music/bowling_for_soup_-_1985.ogg
...
Clip info:
 Genre: Pop
 Name: 1985
 Artist: Bowling for Soup
 Creation Date: 2004
 Album: A Hangover You Don't Deserve
 Track: 03

Thanks for your quick responses!

--
Konrad Meyer <konrad@tylerc.org> http://konrad.sobertillnoon.com/
#4
On 9/15/07, Konrad Meyer <konrad@tylerc.org> wrote:
> Er, I'm still not getting information out of ogg files:
>
> $ mdh -p ~/music/bowling_for_soup_-_1985.ogg
> ---
> Video.Duration: 192.78
> Audio.Samplerate: 44100
> Audio.Bitrate: 192.0
> Image.DimensionUnit: px
> Video.Codec: ""
> File.Size: 4618665
> Audio.Codec: vrbs
> File.Modified: 2007-01-03T22:10:11-08:00
> File.Format: video/x-theora+ogg
  ^- That's the problem there. It thinks it's a video file.

<technical blather>
Why? Probably because I hacked the mimetype guesser to _not_ assume things
based on the filename extension, and the shared-mime-info db assumes that
the guesser _is_ assuming things based on the filename extension.

Which is something I'd rather not do with downloaded files (which, by their
very nature, have wild disparities between the extension and the real
mimetype.) And the header content-type is often totally wrong or doesn't
match shared-mime-info's naming (e.g. application/octet-stream vs. image/gif,
audio/x-mp3 vs. audio/mpeg, video/divx vs. video/x-msvideo,
video/x-ms-asf vs. video/vnd.ms-asf...)

And this magic-over-extension sometimes leads to me getting generic
lesser-magic guesses instead of more specific filename extension guesses
(e.g. zip instead of OO document.) So, I have a list of generic formats that
defer to the extension rather than rely on the lesser-magic.

Anyhow, it's ugly, hacky magic. Just like the rest of mimetype guessing.
</technical blather>

But! Fixing this instance of the problem in the next thirty seconds.
...
There! And now, adding ogginfo metadata to video/x-theora+ogg.

Ok, try this:
http://dark.fhtr.org/repos/metadata/metadata-0.4.tar.gz

> Thanks for your quick responses!

Thanks for the bug reports! They really help in making this thing more robust.

> Konrad Meyer <konrad@tylerc.org> http://konrad.sobertillnoon.com/

--
Ilmari Heikkinen
http://fhtr.blogspot.com
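
The "generic formats defer to the extension" rule described in the technical blather can be sketched as follows. This is an illustration under stated assumptions, not the guesser from metadata.rb: `guess_mimetype`, the `GENERIC_TYPES` list and the tiny `EXTENSION_TYPES` table are all hypothetical, and the magic-based guess is simply passed in as a string.

```ruby
# Magic results considered too vague to trust on their own (assumed list).
GENERIC_TYPES = ['application/zip', 'application/octet-stream'].freeze

# Tiny extension table, for illustration only.
EXTENSION_TYPES = {
  '.odt' => 'application/vnd.oasis.opendocument.text',
  '.ods' => 'application/vnd.oasis.opendocument.spreadsheet'
}.freeze

# Trust the content-based (magic) guess, except when it is one of the
# generic types; then fall back to the filename extension if it is known.
def guess_mimetype(magic_guess, filename)
  return magic_guess unless GENERIC_TYPES.include?(magic_guess)
  EXTENSION_TYPES.fetch(File.extname(filename).downcase, magic_guess)
end

puts guess_mimetype('image/gif', 'photo.jpg')         # magic wins over a wrong extension
puts guess_mimetype('application/zip', 'report.odt')  # generic magic defers to the extension
```

The trade-off is the one the post complains about: every entry added to the generic list is a place where a misleading extension can win again.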
#5
On 9/15/07, Konrad Meyer <konrad@tylerc.org> wrote:
> $ mplayer ~/music/bowling_for_soup_-_1985.ogg
> ...
> Clip info:
>  Genre: Pop
>  Name: 1985
>  Artist: Bowling for Soup
>  Creation Date: 2004
>  Album: A Hangover You Don't Deserve
>  Track: 03

Oh, nice, mplayer does give out metadata fields. I better augment
the mplayer info parser to grab those.

0.5 here we come!
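
Grabbing those fields from mplayer's text output could be done along these lines. This is a minimal sketch and not the parser that ended up in 0.5: `parse_clip_info` is a hypothetical name, and it assumes the fields follow a "Clip info:" line and are indented, with the first non-indented line ending the block.

```ruby
# Parse the "Clip info:" block of mplayer's text output into a Hash.
# Assumes each field is an indented "Key: value" line.
def parse_clip_info(output)
  info = {}
  in_block = false
  output.each_line do |line|
    line = line.chomp
    if line.strip == 'Clip info:'
      in_block = true
    elsif in_block
      if line =~ /\A\s+([^:]+):\s*(.*)\z/
        info[$1.strip] = $2.strip
      else
        in_block = false   # first line that is not an indented field ends the block
      end
    end
  end
  info
end

# Sample text modelled on the output quoted above.
sample = <<~OUT
  Clip info:
   Genre: Pop
   Name: 1985
   Artist: Bowling for Soup
   Creation Date: 2004
   Album: A Hangover You Don't Deserve
   Track: 03
  Starting playback...
OUT

clip = parse_clip_info(sample)
puts clip['Artist']
```

Mapping `Name` to a title key and `Creation Date` to a date key would be a separate step on top of this.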
#6
Quoth Ilmari Heikkinen:
> Oh, nice, mplayer does give out metadata fields. I better augment
> the mplayer info parser to grab those.
>
> 0.5 here we come!

Another bug (sorry):

$ mdh -p ~/music/Limp\ Bizkit\ -\ Rollin\'\ \(edited\).ogg
sh: -c: line 0: syntax error near unexpected token `('
sh: -c: line 0: `ogginfo '/home/konrad/music/Limp Bizkit - Rollin\'
(edited).ogg''

(Last line was broken up to email length.)

You're already escaping single quotes for the shell, need to escape
start-parens and end-parens as well.

Thanks,

--
Konrad Meyer <konrad@tylerc.org> http://konrad.sobertillnoon.com/
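
A note on the error above: inside single quotes a POSIX shell treats the backslash literally, so `\'` ends the quoted string and leaves ` (edited).ogg` unquoted, which is where the parenthesis error comes from. One way around hand-rolled quoting is Ruby's standard `Shellwords` module. The sketch below only builds the command string and does not run `ogginfo`; how metadata.rb actually invokes it is not shown in the thread.

```ruby
require 'shellwords'

path = "/home/konrad/music/Limp Bizkit - Rollin' (edited).ogg"

# Option 1: escape the argument before interpolating it into a shell string.
command = "ogginfo #{Shellwords.escape(path)}"
puts command

# Option 2 (Ruby 1.9 and later): pass an argument array so that no shell
# is involved at all, e.g.
#   IO.popen(['ogginfo', path]) { |io| io.read }
```

Splitting the escaped command with `Shellwords.split` gives back the program name and the untouched path, which is a quick way to check the quoting.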
#7
Quoth Ilmari Heikkinen:
> 0.5 here we come!

Also:
For mp3 id3v2 tags, the binary string "\xCB\x99\xC5\xA3" is being inserted at
the front of all the string fields.

$ mdh -p ~/music/Snoop\ Dogg\ -\ Gin\ \&\ Juice.mp3
---
Audio.Album: "\xCB\x99\xC5\xA3Death Row's Snoop Doggy Dogg Greatest Hits (2001)"
...
Audio.Genre: "\xCB\x99\xC5\xA3Hip-Hop"
Audio.Title: "\xCB\x99\xC5\xA3Gin & Juice"
...
Audio.Artist: "\xCB\x99\xC5\xA3Snoop Dogg"

I *think* this is an id3v2 thing. Also, it happens in more than one file and
amaroK sees the tags "correctly", so I'm thinking it's on the metadata's end.

Thanks!

--
Konrad Meyer <konrad@tylerc.org> http://konrad.sobertillnoon.com/
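
One plausible reading of those four bytes: ID3v2 text frames may be stored as UTF-16 with a byte order mark, and the little-endian mark (FF FE), if read as ISO-8859-2 and converted to UTF-8, comes out as exactly "\xCB\x99\xC5\xA3". Whether that is what happens inside metadata.rb (for instance through a charset misdetection) is a guess. The sketch below reproduces the bytes and shows a hypothetical `decode_utf16_with_bom` helper; it uses the String encoding API of Ruby 1.9 and later, not Ruby 1.8.

```ruby
# Reproduce the stray prefix: UTF-16LE BOM misread as ISO-8859-2.
bom = [0xFF, 0xFE].pack('C*')
garbage = bom.dup.force_encoding('ISO-8859-2').encode('UTF-8')
puts garbage.bytes.map { |b| format('%02X', b) }.join(' ')

# Hypothetical helper: decode a raw text payload that may start with a
# UTF-16 byte order mark; anything else is assumed to be UTF-8 already.
def decode_utf16_with_bom(raw)
  bytes = raw.dup.force_encoding(Encoding::BINARY)
  case bytes[0, 2]
  when "\xFF\xFE".b
    bytes[2..-1].force_encoding('UTF-16LE').encode('UTF-8')
  when "\xFE\xFF".b
    bytes[2..-1].force_encoding('UTF-16BE').encode('UTF-8')
  else
    bytes.force_encoding('UTF-8')
  end
end

# Build a sample payload the way a UTF-16LE tag field would be stored.
payload = "\xFF\xFE".b + 'Gin & Juice'.encode('UTF-16LE').b
puts decode_utf16_with_bom(payload)
```

If this is the cause, stripping the mark before any charset detection would remove the prefix from every field at once.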