Extracting Images From D2 File
Member
Posts: 3,476
Joined: Jul 20 2015
Gold: 651.00
Jul 2 2016 10:01pm
Does anyone know how to get all of the images with the correct colors for all items etc inside the d2 mpq files?

http://www.diablofans.com/forums/diablo-iii-misc-forums/diablo-legacy-forums/diablo-ii/72891-download-diablo-ii-music-cinematics-and-speech

That person was able to do it, but how?
Member
Posts: 14,631
Joined: Sep 14 2006
Gold: 575.56
Jul 2 2016 10:44pm
if you scan the file with a binary reader for jpeg markers that'll probably get you something
Member
Posts: 3,476
Joined: Jul 20 2015
Gold: 651.00
Jul 2 2016 11:09pm
Quote (Ideophobe @ Jul 2 2016 11:44pm)
if you scan the file with a binary reader for jpeg markers that'll probably get you something


all the files in the mpq are dc6 format
Member
Posts: 14,631
Joined: Sep 14 2006
Gold: 575.56
Jul 3 2016 12:52am
then find the markers for those right?
start by looking at some plain dc6 files in a hex editor and find the markers, there will be some start of image and end of image hex strings. get somewhere close with that and just start grabbing chunks and trying to render it
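
A minimal sketch of that kind of marker scan in C. `find_marker` is a hypothetical helper, and the pattern to search for is whatever you spotted in the hex editor; the JPEG start-of-image bytes FF D8 FF in the usage note are only a familiar example. It also assumes the data is stored uncompressed, which may not hold inside an MPQ archive.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the offset of the first occurrence of `pat` in `buf` at or after
 * `start`, or -1 if there is none. Hypothetical helper; naive O(n*m) scan. */
long find_marker(const unsigned char *buf, size_t len, size_t start,
                 const unsigned char *pat, size_t pat_len)
{
    assert(buf != NULL && pat != NULL && pat_len > 0);
    if (len < pat_len)
        return -1;
    for (size_t i = start; i + pat_len <= len; i++) {
        if (memcmp(buf + i, pat, pat_len) == 0)
            return (long)i;
    }
    return -1;
}
```

Calling it in a loop, passing the previous hit plus one as `start`, gives every candidate offset; each one is only a guess until the chunk actually renders.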

http://www.zezula.net/en/mpq/mpqformat.html

Quote (zezula.net)
For MPQ format version 1, size of the MPQ header is 32 (0x20) bytes.
For MPQ format version 2, size of the MPQ header is 44 (0x2C) bytes.
For MPQ format version 3, size of the MPQ header must be greater or equal to 44 (0x2C) bytes.
For MPQ format version 4, size of the MPQ header is 208 (0xD0) bytes.


Quote (zezula.net)
All files stored in the MPQ are usually saved at position following the MPQ header. This is not mandatory, however, the only known exception is savegames for Diablo I, where hash table and block table follow immediately after MPQ header.


This post was edited by Ideophobe on Jul 3 2016 01:03am
Member
Posts: 13,425
Joined: Sep 29 2007
Gold: 0.00
Warn: 20%
Jul 3 2016 10:04pm
I spent all of 3 minutes googling and found all the tools needed to achieve this.

Go download MPQExtractor and DC6CON. All that's left is to extract the DC6 files as well as the palette .dat files to color them, or else they will use the default grayscale palette.
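
To show why the palette matters, here is a rough sketch in C of turning 8-bit pixel indices into colors. It assumes the palette file holds 256 entries of 3 bytes each in blue, green, red order; neither detail is confirmed in this thread, so swap the channels if the colors look wrong. `index_to_rgb` and `colorize` are hypothetical helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PALETTE_ENTRIES 256

typedef struct { uint8_t r, g, b; } rgb_t;

/* Map one 8-bit pixel index to a color.
 * `pal` points at PALETTE_ENTRIES * 3 bytes read from the palette file
 * (assumed B, G, R order), or is NULL to fall back to grayscale. */
rgb_t index_to_rgb(const uint8_t *pal, uint8_t index)
{
    rgb_t c;
    if (pal == NULL) {
        c.r = c.g = c.b = index;   /* no palette: gray ramp */
        return c;
    }
    const uint8_t *e = pal + (size_t)index * 3;
    c.b = e[0];
    c.g = e[1];
    c.r = e[2];
    return c;
}

/* Expand `n` indexed pixels into `out` (caller provides n entries). */
void colorize(const uint8_t *pal, const uint8_t *pixels, size_t n, rgb_t *out)
{
    assert(pixels != NULL && out != NULL);
    for (size_t i = 0; i < n; i++)
        out[i] = index_to_rgb(pal, pixels[i]);
}
```

Passing NULL as the palette reproduces the gray output you get when no palette file is supplied.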

If you are looking to implement your own DC6 to PCX converter you can view dc6con.c for inspiration, although I doubt you will understand it since it works on the byte/bit level.

Quote (Ideophobe @ Jul 3 2016 02:52am)
then find the markers for those right?
start by looking at some plain dc6 files in a hex editor and find the markers, there will be some start of image and end of image hex strings. get somewhere close with that and just start grabbing chunks and trying to render it


This is pretty close to what the OP has to do if they want to implement their own extraction algorithm. Most of the time an image file starts with a header containing a magic string for identification, then a list of attributes such as height, width, color depth, etc., depending on how complex the format is. From there most implementations just run the actual data to EOF, but in this case DC6 uses an EOF marker.

For example this is the DC6 header:

Code
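#include <stdint.h>

// Comments show the bytes as they appear in the file (little-endian).
// Fixed-width types are used because `long` is 8 bytes on many 64-bit systems.
typedef struct dc6header_s {
    int32_t version;        // 06000000
    int32_t unknown01;      // 01000000
    int32_t unknown02;      // 00000000
    uint8_t termination[4]; // EEEEEEEE or CDCDCDCD
    int32_t unknown03;      // 10000000
    int32_t blockcount;     // 10000000
    // after this, are pointers
    int32_t pointer[];      // flexible array member
} dc6header;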


where there is a standardized version header 0x06, some unknown values (probably some kind of bitmask settings for image editors), the EOF marker, a block count (probably for how many pointer blocks there are) and a pointer to the start of the actual block data.
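
A sketch of reading those fields from a buffer in C, assuming they are 32-bit little-endian values (the hex in the struct comments, 06000000 for version 6, suggests as much). `read_le32` and `parse_dc6_header` are hypothetical names, and the field meanings are still the guesses from the struct above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DC6_FIXED_HEADER_SIZE 24  /* six 4-byte fields before the pointers */

typedef struct {
    uint32_t version;
    uint32_t unknown01;
    uint32_t unknown02;
    uint8_t  termination[4];
    uint32_t unknown03;
    uint32_t blockcount;
} dc6_fixed_header;

/* Decode a 32-bit little-endian value regardless of host byte order. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 on success, -1 if the buffer is too short or version is not 6. */
int parse_dc6_header(const uint8_t *buf, size_t len, dc6_fixed_header *out)
{
    assert(buf != NULL && out != NULL);
    if (len < DC6_FIXED_HEADER_SIZE)
        return -1;
    out->version    = read_le32(buf + 0);
    out->unknown01  = read_le32(buf + 4);
    out->unknown02  = read_le32(buf + 8);
    memcpy(out->termination, buf + 12, 4);
    out->unknown03  = read_le32(buf + 16);
    out->blockcount = read_le32(buf + 20);
    return out->version == 6 ? 0 : -1;
}
```

Reading field by field avoids depending on struct padding or host endianness, which a direct cast of the buffer to the struct would.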

From there you will extract data at BLOCK_SIZE into another struct which may hold width, height, some kind of positional data, and the color or palette markings.

You can compare this to a hex dump of the DC6 file where you can clearly see how the header overlays overtop of the first 24 bytes:



But anyways this is probably all too complex for OP.

This post was edited by AbDuCt on Jul 3 2016 10:15pm
Member
Posts: 14,631
Joined: Sep 14 2006
Gold: 575.56
Jul 4 2016 12:24am
ya i had to do something very similar at work a few weeks ago
010 Editor is the shit. not an incredibly fun project though if you're going into it blind. sounds like all the mindless staring at your monitor for hours has been done already

This post was edited by Ideophobe on Jul 4 2016 12:34am
Member
Posts: 13,425
Joined: Sep 29 2007
Gold: 0.00
Warn: 20%
Jul 4 2016 07:29am
It's not incredibly difficult if you have experience and either a live subject or lots of samples.

For instance the same technique is used for overlaying video game memory onto a named structure to access it more easily. After playing with the live environment and watching for variables to change, it becomes quite obvious which segment of data is what type and belongs to what value.

In the case of DC6, even if you didn't have a spec to go by you could quite easily infer the header from multiple files. Running a diff on them will show which segments of data change and which are fixed. From there I personally would just toss the header out and jump directly into trying to decode the real data. Figuring that out would be a bit more difficult and would likely need to be done by reversing whatever program reads or writes the file in question. Although if the header has required information specific to the decoding of the image data, then that too would have to be reversed anyway rather than being thrown out.
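
The diffing step can be sketched in a few lines of C: given the first bytes of several sample files, mark which offsets hold the same value in every sample. `find_fixed_bytes` is a hypothetical helper, and loading the samples from disk is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* samples: array of `count` pointers, each to at least `len` bytes.
 * fixed[i] is set to 1 if byte i is identical in every sample, else 0.
 * Returns the number of fixed offsets. */
size_t find_fixed_bytes(const uint8_t *const *samples, size_t count,
                        size_t len, uint8_t *fixed)
{
    assert(samples != NULL && fixed != NULL && count > 0);
    size_t n_fixed = 0;
    for (size_t i = 0; i < len; i++) {
        fixed[i] = 1;
        for (size_t s = 1; s < count; s++) {
            if (samples[s][i] != samples[0][i]) {
                fixed[i] = 0;   /* at least one sample disagrees */
                break;
            }
        }
        n_fixed += fixed[i];
    }
    return n_fixed;
}
```

Offsets that come back fixed are candidates for magic values or padding; the ones that vary are where sizes, counts and pointers are likely to live. With only a handful of samples a byte can look fixed by coincidence, so more files give a more trustworthy map.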

I enjoy doing this kind of reversing on a semi regular basis mostly with network protocols. I find it entertaining.

This post was edited by AbDuCt on Jul 4 2016 07:31am
Member
Posts: 14,631
Joined: Sep 14 2006
Gold: 575.56
Jul 4 2016 12:45pm
i just assumed it would all be big endian but i have no idea, i'm not actually going to look at these files

i don't like it... reverse engineering /parsing binary files is not fun for me and if there is any way around it i'll take it. last one i looked at was taking special 360 degree video files breaking them into frames (sets of 5 jpegs with a header) and adding information to those headers
terrible experience... almost no reference material, and what i had was in completely different formats... it was like 80 hours of staring at a hex editor saying, "i really thought that would work"