facebook360_dep
Facebook360 Depth Estimation Pipeline
stb_image.h
Go to the documentation of this file.
1 /* stb_image - v2.12 - public domain image loader - http://nothings.org/stb_image.h
2  no warranty implied; use at your own risk
3 
4  Do this:
5  #define STB_IMAGE_IMPLEMENTATION
6  before you include this file in *one* C or C++ file to create the implementation.
7 
8  // i.e. it should look like this:
9  #include ...
10  #include ...
11  #include ...
12  #define STB_IMAGE_IMPLEMENTATION
13 #include "stb_image.h"
14 
15  You can #define STBI_ASSERT(x) before the #include to avoid using assert.h.
16  And #define STBI_MALLOC, STBI_REALLOC, and STBI_FREE to avoid using malloc,realloc,free
17 
18 
19  QUICK NOTES:
20  Primarily of interest to game developers and other people who can
21  avoid problematic images and only need the trivial interface
22 
23  JPEG baseline & progressive (12 bpc/arithmetic not supported, same as stock IJG lib)
24  PNG 1/2/4/8-bit-per-channel (16 bpc not supported)
25 
26  TGA (not sure what subset, if a subset)
27  BMP non-1bpp, non-RLE
28  PSD (composited view only, no extra channels, 8/16 bit-per-channel)
29 
30  GIF (*comp always reports as 4-channel)
31  HDR (radiance rgbE format)
32  PIC (Softimage PIC)
33  PNM (PPM and PGM binary only)
34 
35  Animated GIF still needs a proper API, but here's one way to do it:
36  http://gist.github.com/urraka/685d9a6340b26b830d49
37 
38  - decode from memory or through FILE (define STBI_NO_STDIO to remove code)
39  - decode from arbitrary I/O callbacks
40  - SIMD acceleration on x86/x64 (SSE2) and ARM (NEON)
41 
42  Full documentation under "DOCUMENTATION" below.
43 
44 
45  Revision 2.00 release notes:
46 
47  - Progressive JPEG is now supported.
48 
49  - PPM and PGM binary formats are now supported, thanks to Ken Miller.
50 
51  - x86 platforms now make use of SSE2 SIMD instructions for
52  JPEG decoding, and ARM platforms can use NEON SIMD if requested.
53  This work was done by Fabian "ryg" Giesen. SSE2 is used by
54  default, but NEON must be enabled explicitly; see docs.
55 
56  With other JPEG optimizations included in this version, we see
57  2x speedup on a JPEG on an x86 machine, and a 1.5x speedup
58  on a JPEG on an ARM machine, relative to previous versions of this
59  library. The same results will not obtain for all JPGs and for all
60  x86/ARM machines. (Note that progressive JPEGs are significantly
61  slower to decode than regular JPEGs.) This doesn't mean that this
62  is the fastest JPEG decoder in the land; rather, it brings it
63  closer to parity with standard libraries. If you want the fastest
64  decode, look elsewhere. (See "Philosophy" section of docs below.)
65 
66  See final bullet items below for more info on SIMD.
67 
68  - Added STBI_MALLOC, STBI_REALLOC, and STBI_FREE macros for replacing
69  the memory allocator. Unlike other STBI libraries, these macros don't
70  support a context parameter, so if you need to pass a context in to
71  the allocator, you'll have to store it in a global or a thread-local
72  variable.
73 
74  - Split existing STBI_NO_HDR flag into two flags, STBI_NO_HDR and
75  STBI_NO_LINEAR.
76  STBI_NO_HDR: suppress implementation of .hdr reader format
77  STBI_NO_LINEAR: suppress high-dynamic-range light-linear float API
78 
79  - You can suppress implementation of any of the decoders to reduce
80  your code footprint by #defining one or more of the following
81  symbols before creating the implementation.
82 
83  STBI_NO_JPEG
84  STBI_NO_PNG
85  STBI_NO_BMP
86  STBI_NO_PSD
87  STBI_NO_TGA
88  STBI_NO_GIF
89  STBI_NO_HDR
90  STBI_NO_PIC
91  STBI_NO_PNM (.ppm and .pgm)
92 
93  - You can request *only* certain decoders and suppress all other ones
94  (this will be more forward-compatible, as addition of new decoders
95  doesn't require you to disable them explicitly):
96 
97  STBI_ONLY_JPEG
98  STBI_ONLY_PNG
99  STBI_ONLY_BMP
100  STBI_ONLY_PSD
101  STBI_ONLY_TGA
102  STBI_ONLY_GIF
103  STBI_ONLY_HDR
104  STBI_ONLY_PIC
105  STBI_ONLY_PNM (.ppm and .pgm)
106 
107  Note that you can define multiples of these, and you will get all
108  of them ("only x" and "only y" is interpreted to mean "only x&y").
109 
110  - If you use STBI_NO_PNG (or _ONLY_ without PNG), and you still
111  want the zlib decoder to be available, #define STBI_SUPPORT_ZLIB
112 
113  - Compilation of all SIMD code can be suppressed with
114  #define STBI_NO_SIMD
115  It should not be necessary to disable SIMD unless you have issues
116  compiling (e.g. using an x86 compiler which doesn't support SSE
117  intrinsics or that doesn't support the method used to detect
118  SSE2 support at run-time), and even those can be reported as
119  bugs so I can refine the built-in compile-time checking to be
120  smarter.
121 
122  - The old STBI_SIMD system which allowed installing a user-defined
123  IDCT etc. has been removed. If you need this, don't upgrade. My
124  assumption is that almost nobody was doing this, and those who
125  were will find the built-in SIMD more satisfactory anyway.
126 
127  - RGB values computed for JPEG images are slightly different from
128  previous versions of stb_image. (This is due to using less
129  integer precision in SIMD.) The C code has been adjusted so
130  that the same RGB values will be computed regardless of whether
131  SIMD support is available, so your app should always produce
132  consistent results. But these results are slightly different from
133  previous versions. (Specifically, about 3% of available YCbCr values
134  will compute different RGB results from pre-1.49 versions by +-1;
135  most of the deviating values are one smaller in the G channel.)
136 
137  - If you must produce consistent results with previous versions of
138  stb_image, #define STBI_JPEG_OLD and you will get the same results
139  you used to; however, you will not get the SIMD speedups for
140  the YCbCr-to-RGB conversion step (although you should still see
141  significant JPEG speedup from the other changes).
142 
143  Please note that STBI_JPEG_OLD is a temporary feature; it will be
144  removed in future versions of the library. It is only intended for
145  near-term back-compatibility use.
146 
147 
148  Latest revision history:
149  2.12 (2016-04-02) fix typo in 2.11 PSD fix that caused crashes
150  2.11 (2016-04-02) 16-bit PNGS; enable SSE2 in non-gcc x64
151  RGB-format JPEG; remove white matting in PSD;
152  allocate large structures on the stack;
153  correct channel count for PNG & BMP
154  2.10 (2016-01-22) avoid warning introduced in 2.09
155  2.09 (2016-01-16) 16-bit TGA; comments in PNM files; STBI_REALLOC_SIZED
156  2.08 (2015-09-13) fix to 2.07 cleanup, reading RGB PSD as RGBA
157  2.07 (2015-09-13) partial animated GIF support
158  limited 16-bit PSD support
159  minor bugs, code cleanup, and compiler warnings
160  2.06 (2015-04-19) fix bug where PSD returns wrong '*comp' value
161  2.05 (2015-04-19) fix bug in progressive JPEG handling, fix warning
162  2.04 (2015-04-15) try to re-enable SIMD on MinGW 64-bit
163  2.03 (2015-04-12) additional corruption checking
164  stbi_set_flip_vertically_on_load
165  fix NEON support; fix mingw support
166  2.02 (2015-01-19) fix incorrect assert, fix warning
167  2.01 (2015-01-17) fix various warnings
168  2.00b (2014-12-25) fix STBI_MALLOC in progressive JPEG
169  2.00 (2014-12-25) optimize JPEG, including x86 SSE2 & ARM NEON SIMD
170  progressive JPEG
171  PGM/PPM support
172  STBI_MALLOC,STBI_REALLOC,STBI_FREE
173  STBI_NO_*, STBI_ONLY_*
174  GIF bugfix
175 
176  See end of file for full revision history.
177 
178 
179  ============================ Contributors =========================
180 
181  Image formats Extensions, features
182  Sean Barrett (jpeg, png, bmp) Jetro Lauha (stbi_info)
183  Nicolas Schulz (hdr, psd) Martin "SpartanJ" Golini (stbi_info)
184  Jonathan Dummer (tga) James "moose2000" Brown (iPhone PNG)
185  Jean-Marc Lienher (gif) Ben "Disch" Wenger (io callbacks)
186  Tom Seddon (pic) Omar Cornut (1/2/4-bit PNG)
187  Thatcher Ulrich (psd) Nicolas Guillemot (vertical flip)
188  Ken Miller (pgm, ppm) Richard Mitton (16-bit PSD)
189  urraka@github (animated gif) Junggon Kim (PNM comments)
190  Daniel Gibson (16-bit TGA)
191 
192  Optimizations & bugfixes
193  Fabian "ryg" Giesen
194  Arseny Kapoulkine
195 
196  Bug & warning fixes
197  Marc LeBlanc David Woo Guillaume George Martins Mozeiko
198  Christpher Lloyd Martin Golini Jerry Jansson Joseph Thomson
199  Dave Moore Roy Eltham Hayaki Saito Phil Jordan
200  Won Chun Luke Graham Johan Duparc Nathan Reed
201  the Horde3D community Thomas Ruf Ronny Chevalier Nick Verigakis
202  Janez Zemva John Bartholomew Michal Cichon svdijk@github
203  Jonathan Blow Ken Hamada Tero Hanninen Baldur Karlsson
204  Laurent Gomila Cort Stratton Sergio Gonzalez romigrou@github
205  Aruelien Pocheville Thibault Reuille Cass Everitt Matthew Gregan
206  Ryamond Barbiero Paul Du Bois Engin Manap snagar@github
207  Michaelangel007@github Oriol Ferrer Mesia socks-the-fox
208  Blazej Dariusz Roszkowski
209 
210 
211 LICENSE
212 
213 This software is dual-licensed to the public domain and under the following
214 license: you are granted a perpetual, irrevocable license to copy, modify,
215 publish, and distribute this file as you see fit.
216 
217 */
218 
219 #ifndef STBI_INCLUDE_STB_IMAGE_H
220 #define STBI_INCLUDE_STB_IMAGE_H
221 
222 // DOCUMENTATION
223 //
224 // Limitations:
225 // - no 16-bit-per-channel PNG
226 // - no 12-bit-per-channel JPEG
227 // - no JPEGs with arithmetic coding
228 // - no 1-bit BMP
229 // - GIF always returns *comp=4
230 //
231 // Basic usage (see HDR discussion below for HDR usage):
232 // int x,y,n;
233 // unsigned char *data = stbi_load(filename, &x, &y, &n, 0);
234 // // ... process data if not NULL ...
235 // // ... x = width, y = height, n = # 8-bit components per pixel ...
236 // // ... replace '0' with '1'..'4' to force that many components per pixel
237 // // ... but 'n' will always be the number that it would have been if you said 0
238 // stbi_image_free(data)
239 //
240 // Standard parameters:
241 // int *x -- outputs image width in pixels
242 // int *y -- outputs image height in pixels
243 // int *comp -- outputs # of image components in image file
244 // int req_comp -- if non-zero, # of image components requested in result
245 //
246 // The return value from an image loader is an 'unsigned char *' which points
247 // to the pixel data, or NULL on an allocation failure or if the image is
248 // corrupt or invalid. The pixel data consists of *y scanlines of *x pixels,
249 // with each pixel consisting of N interleaved 8-bit components; the first
250 // pixel pointed to is top-left-most in the image. There is no padding between
251 // image scanlines or between pixels, regardless of format. The number of
252 // components N is 'req_comp' if req_comp is non-zero, or *comp otherwise.
253 // If req_comp is non-zero, *comp has the number of components that _would_
254 // have been output otherwise. E.g. if you set req_comp to 4, you will always
255 // get RGBA output, but you can check *comp to see if it's trivially opaque
256 // because e.g. there were only 3 channels in the source image.
257 //
258 // An output image with N components has the following components interleaved
259 // in this order in each pixel:
260 //
261 // N=#comp components
262 // 1 grey
263 // 2 grey, alpha
264 // 3 red, green, blue
265 // 4 red, green, blue, alpha
266 //
267 // If image loading fails for any reason, the return value will be NULL,
268 // and *x, *y, *comp will be unchanged. The function stbi_failure_reason()
269 // can be queried for an extremely brief, end-user unfriendly explanation
270 // of why the load failed. Define STBI_NO_FAILURE_STRINGS to avoid
271 // compiling these strings at all, and STBI_FAILURE_USERMSG to get slightly
272 // more user-friendly ones.
273 //
274 // Paletted PNG, BMP, GIF, and PIC images are automatically depalettized.
275 //
276 // ===========================================================================
277 //
278 // Philosophy
279 //
280 // stb libraries are designed with the following priorities:
281 //
282 // 1. easy to use
283 // 2. easy to maintain
284 // 3. good performance
285 //
286 // Sometimes I let "good performance" creep up in priority over "easy to maintain",
287 // and for best performance I may provide less-easy-to-use APIs that give higher
288 // performance, in addition to the easy to use ones. Nevertheless, it's important
289 // to keep in mind that from the standpoint of you, a client of this library,
290 // all you care about is #1 and #3, and stb libraries do not emphasize #3 above all.
291 //
292 // Some secondary priorities arise directly from the first two, some of which
293 // make more explicit reasons why performance can't be emphasized.
294 //
295 // - Portable ("ease of use")
296 // - Small footprint ("easy to maintain")
297 // - No dependencies ("ease of use")
298 //
299 // ===========================================================================
300 //
301 // I/O callbacks
302 //
303 // I/O callbacks allow you to read from arbitrary sources, like packaged
304 // files or some other source. Data read from callbacks are processed
305 // through a small internal buffer (currently 128 bytes) to try to reduce
306 // overhead.
307 //
308 // The three functions you must define are "read" (reads some bytes of data),
309 // "skip" (skips some bytes of data), "eof" (reports if the stream is at the end).
310 //
311 // ===========================================================================
312 //
313 // SIMD support
314 //
315 // The JPEG decoder will try to automatically use SIMD kernels on x86 when
316 // supported by the compiler. For ARM Neon support, you must explicitly
317 // request it.
318 //
319 // (The old do-it-yourself SIMD API is no longer supported in the current
320 // code.)
321 //
322 // On x86, SSE2 will automatically be used when available based on a run-time
323 // test; if not, the generic C versions are used as a fall-back. On ARM targets,
324 // the typical path is to have separate builds for NEON and non-NEON devices
325 // (at least this is true for iOS and Android). Therefore, the NEON support is
326 // toggled by a build flag: define STBI_NEON to get NEON loops.
327 //
328 // The output of the JPEG decoder is slightly different from versions where
329 // SIMD support was introduced (that is, for versions before 1.49). The
330 // difference is only +-1 in the 8-bit RGB channels, and only on a small
331 // fraction of pixels. You can force the pre-1.49 behavior by defining
332 // STBI_JPEG_OLD, but this will disable some of the SIMD decoding path
333 // and hence cost some performance.
334 //
335 // If for some reason you do not want to use any of SIMD code, or if
336 // you have issues compiling it, you can disable it entirely by
337 // defining STBI_NO_SIMD.
338 //
339 // ===========================================================================
340 //
341 // HDR image support (disable by defining STBI_NO_HDR)
342 //
343 // stb_image now supports loading HDR images in general, and currently
344 // the Radiance .HDR file format, although the support is provided
345 // generically. You can still load any file through the existing interface;
346 // if you attempt to load an HDR file, it will be automatically remapped to
347 // LDR, assuming gamma 2.2 and an arbitrary scale factor defaulting to 1;
348 // both of these constants can be reconfigured through this interface:
349 //
350 // stbi_hdr_to_ldr_gamma(2.2f);
351 // stbi_hdr_to_ldr_scale(1.0f);
352 //
353 // (note, do not use _inverse_ constants; stbi_image will invert them
354 // appropriately).
355 //
356 // Additionally, there is a new, parallel interface for loading files as
357 // (linear) floats to preserve the full dynamic range:
358 //
359 // float *data = stbi_loadf(filename, &x, &y, &n, 0);
360 //
361 // If you load LDR images through this interface, those images will
362 // be promoted to floating point values, run through the inverse of
363 // constants corresponding to the above:
364 //
365 // stbi_ldr_to_hdr_scale(1.0f);
366 // stbi_ldr_to_hdr_gamma(2.2f);
367 //
368 // Finally, given a filename (or an open file or memory block--see header
369 // file for details) containing image data, you can query for the "most
370 // appropriate" interface to use (that is, whether the image is HDR or
371 // not), using:
372 //
373 // stbi_is_hdr(char *filename);
374 //
375 // ===========================================================================
376 //
377 // iPhone PNG support:
378 //
379 // By default we convert iphone-formatted PNGs back to RGB, even though
380 // they are internally encoded differently. You can disable this conversion
381 // by by calling stbi_convert_iphone_png_to_rgb(0), in which case
382 // you will always just get the native iphone "format" through (which
383 // is BGR stored in RGB).
384 //
385 // Call stbi_set_unpremultiply_on_load(1) as well to force a divide per
386 // pixel to remove any premultiplied alpha *only* if the image file explicitly
387 // says there's premultiplied data (currently only happens in iPhone images,
388 // and only if iPhone convert-to-rgb processing is on).
389 //
390 
391 
392 #ifndef STBI_NO_STDIO
393 #include <stdio.h>
394 #endif // STBI_NO_STDIO
395 
396 #define STBI_VERSION 1
397 
398 enum
399 {
400  STBI_default = 0, // only used for req_comp
401 
404  STBI_rgb = 3,
406 };
407 
408 typedef unsigned char stbi_uc;
409 
410 #ifdef __cplusplus
411 extern "C" {
412 #endif
413 
414 #ifdef STB_IMAGE_STATIC
415 #define STBIDEF static
416 #else
417 #define STBIDEF extern
418 #endif
419 
421 //
422 // PRIMARY API - works on images of any type
423 //
424 
425 //
426 // load image by filename, open file, or memory buffer
427 //
428 
429 typedef struct
430 {
431  int (*read) (void *user,char *data,int size); // fill 'data' with 'size' bytes. return number of bytes actually read
432  void (*skip) (void *user,int n); // skip the next 'n' bytes, or 'unget' the last -n bytes if negative
433  int (*eof) (void *user); // returns nonzero if we are at end of file/data
435 
436 STBIDEF stbi_uc *stbi_load (char const *filename, int *x, int *y, int *comp, int req_comp);
437 STBIDEF stbi_uc *stbi_load_from_memory (stbi_uc const *buffer, int len , int *x, int *y, int *comp, int req_comp);
438 STBIDEF stbi_uc *stbi_load_from_callbacks(stbi_io_callbacks const *clbk , void *user, int *x, int *y, int *comp, int req_comp);
439 
440 #ifndef STBI_NO_STDIO
441 STBIDEF stbi_uc *stbi_load_from_file (FILE *f, int *x, int *y, int *comp, int req_comp);
442 // for stbi_load_from_file, file pointer is left pointing immediately after image
443 #endif
444 
445 #ifndef STBI_NO_LINEAR
446  STBIDEF float *stbi_loadf (char const *filename, int *x, int *y, int *comp, int req_comp);
447  STBIDEF float *stbi_loadf_from_memory (stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp);
448  STBIDEF float *stbi_loadf_from_callbacks (stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp);
449 
450  #ifndef STBI_NO_STDIO
451  STBIDEF float *stbi_loadf_from_file (FILE *f, int *x, int *y, int *comp, int req_comp);
452  #endif
453 #endif
454 
455 #ifndef STBI_NO_HDR
456  STBIDEF void stbi_hdr_to_ldr_gamma(float gamma);
457  STBIDEF void stbi_hdr_to_ldr_scale(float scale);
458 #endif // STBI_NO_HDR
459 
460 #ifndef STBI_NO_LINEAR
461  STBIDEF void stbi_ldr_to_hdr_gamma(float gamma);
462  STBIDEF void stbi_ldr_to_hdr_scale(float scale);
463 #endif // STBI_NO_LINEAR
464 
465 // stbi_is_hdr is always defined, but always returns false if STBI_NO_HDR
466 STBIDEF int stbi_is_hdr_from_callbacks(stbi_io_callbacks const *clbk, void *user);
467 STBIDEF int stbi_is_hdr_from_memory(stbi_uc const *buffer, int len);
468 #ifndef STBI_NO_STDIO
469 STBIDEF int stbi_is_hdr (char const *filename);
470 STBIDEF int stbi_is_hdr_from_file(FILE *f);
471 #endif // STBI_NO_STDIO
472 
473 
474 // get a VERY brief reason for failure
475 // NOT THREADSAFE
476 STBIDEF const char *stbi_failure_reason (void);
477 
478 // free the loaded image -- this is just free()
479 STBIDEF void stbi_image_free (void *retval_from_stbi_load);
480 
481 // get image dimensions & components without fully decoding
482 STBIDEF int stbi_info_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp);
483 STBIDEF int stbi_info_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp);
484 
485 #ifndef STBI_NO_STDIO
486 STBIDEF int stbi_info (char const *filename, int *x, int *y, int *comp);
487 STBIDEF int stbi_info_from_file (FILE *f, int *x, int *y, int *comp);
488 
489 #endif
490 
491 
492 
493 // for image formats that explicitly notate that they have premultiplied alpha,
494 // we just return the colors as stored in the file. set this flag to force
495 // unpremultiplication. results are undefined if the unpremultiply overflow.
496 STBIDEF void stbi_set_unpremultiply_on_load(int flag_true_if_should_unpremultiply);
497 
498 // indicate whether we should process iphone images back to canonical format,
499 // or just pass them through "as-is"
500 STBIDEF void stbi_convert_iphone_png_to_rgb(int flag_true_if_should_convert);
501 
502 // flip the image vertically, so the first pixel in the output array is the bottom left
503 STBIDEF void stbi_set_flip_vertically_on_load(int flag_true_if_should_flip);
504 
505 // ZLIB client - used by PNG, available for other purposes
506 
507 STBIDEF char *stbi_zlib_decode_malloc_guesssize(const char *buffer, int len, int initial_size, int *outlen);
508 STBIDEF char *stbi_zlib_decode_malloc_guesssize_headerflag(const char *buffer, int len, int initial_size, int *outlen, int parse_header);
509 STBIDEF char *stbi_zlib_decode_malloc(const char *buffer, int len, int *outlen);
510 STBIDEF int stbi_zlib_decode_buffer(char *obuffer, int olen, const char *ibuffer, int ilen);
511 
512 STBIDEF char *stbi_zlib_decode_noheader_malloc(const char *buffer, int len, int *outlen);
513 STBIDEF int stbi_zlib_decode_noheader_buffer(char *obuffer, int olen, const char *ibuffer, int ilen);
514 
515 
516 #ifdef __cplusplus
517 }
518 #endif
519 
520 //
521 //
523 #endif // STBI_INCLUDE_STB_IMAGE_H
524 
525 #ifdef STB_IMAGE_IMPLEMENTATION
526 
527 #if defined(STBI_ONLY_JPEG) || defined(STBI_ONLY_PNG) || defined(STBI_ONLY_BMP) \
528  || defined(STBI_ONLY_TGA) || defined(STBI_ONLY_GIF) || defined(STBI_ONLY_PSD) \
529  || defined(STBI_ONLY_HDR) || defined(STBI_ONLY_PIC) || defined(STBI_ONLY_PNM) \
530  || defined(STBI_ONLY_ZLIB)
531  #ifndef STBI_ONLY_JPEG
532  #define STBI_NO_JPEG
533  #endif
534  #ifndef STBI_ONLY_PNG
535  #define STBI_NO_PNG
536  #endif
537  #ifndef STBI_ONLY_BMP
538  #define STBI_NO_BMP
539  #endif
540  #ifndef STBI_ONLY_PSD
541  #define STBI_NO_PSD
542  #endif
543  #ifndef STBI_ONLY_TGA
544  #define STBI_NO_TGA
545  #endif
546  #ifndef STBI_ONLY_GIF
547  #define STBI_NO_GIF
548  #endif
549  #ifndef STBI_ONLY_HDR
550  #define STBI_NO_HDR
551  #endif
552  #ifndef STBI_ONLY_PIC
553  #define STBI_NO_PIC
554  #endif
555  #ifndef STBI_ONLY_PNM
556  #define STBI_NO_PNM
557  #endif
558 #endif
559 
560 #if defined(STBI_NO_PNG) && !defined(STBI_SUPPORT_ZLIB) && !defined(STBI_NO_ZLIB)
561 #define STBI_NO_ZLIB
562 #endif
563 
564 
565 #include <stdarg.h>
566 #include <stddef.h> // ptrdiff_t on osx
567 #include <stdlib.h>
568 #include <string.h>
569 
570 #if !defined(STBI_NO_LINEAR) || !defined(STBI_NO_HDR)
571 #include <math.h> // ldexp
572 #endif
573 
574 #ifndef STBI_NO_STDIO
575 #include <stdio.h>
576 #endif
577 
578 #ifndef STBI_ASSERT
579 #include <assert.h>
580 #define STBI_ASSERT(x) assert(x)
581 #endif
582 
583 
584 #ifndef _MSC_VER
585  #ifdef __cplusplus
586  #define stbi_inline inline
587  #else
588  #define stbi_inline
589  #endif
590 #else
591  #define stbi_inline __forceinline
592 #endif
593 
594 
595 #ifdef _MSC_VER
596 typedef unsigned short stbi__uint16;
597 typedef signed short stbi__int16;
598 typedef unsigned int stbi__uint32;
599 typedef signed int stbi__int32;
600 #else
601 #include <stdint.h>
602 typedef uint16_t stbi__uint16;
603 typedef int16_t stbi__int16;
604 typedef uint32_t stbi__uint32;
605 typedef int32_t stbi__int32;
606 #endif
607 
608 // should produce compiler error if size is wrong
609 typedef unsigned char validate_uint32[sizeof(stbi__uint32)==4 ? 1 : -1];
610 
611 #ifdef _MSC_VER
612 #define STBI_NOTUSED(v) (void)(v)
613 #else
614 #define STBI_NOTUSED(v) (void)sizeof(v)
615 #endif
616 
617 #ifdef _MSC_VER
618 #define STBI_HAS_LROTL
619 #endif
620 
621 #ifdef STBI_HAS_LROTL
622  #define stbi_lrot(x,y) _lrotl(x,y)
623 #else
624  #define stbi_lrot(x,y) (((x) << (y)) | ((x) >> (32 - (y))))
625 #endif
626 
627 #if defined(STBI_MALLOC) && defined(STBI_FREE) && (defined(STBI_REALLOC) || defined(STBI_REALLOC_SIZED))
628 // ok
629 #elif !defined(STBI_MALLOC) && !defined(STBI_FREE) && !defined(STBI_REALLOC) && !defined(STBI_REALLOC_SIZED)
630 // ok
631 #else
632 #error "Must define all or none of STBI_MALLOC, STBI_FREE, and STBI_REALLOC (or STBI_REALLOC_SIZED)."
633 #endif
634 
635 #ifndef STBI_MALLOC
636 #define STBI_MALLOC(sz) malloc(sz)
637 #define STBI_REALLOC(p,newsz) realloc(p,newsz)
638 #define STBI_FREE(p) free(p)
639 #endif
640 
641 #ifndef STBI_REALLOC_SIZED
642 #define STBI_REALLOC_SIZED(p,oldsz,newsz) STBI_REALLOC(p,newsz)
643 #endif
644 
645 // x86/x64 detection
646 #if defined(__x86_64__) || defined(_M_X64)
647 #define STBI__X64_TARGET
648 #elif defined(__i386) || defined(_M_IX86)
649 #define STBI__X86_TARGET
650 #endif
651 
652 #if defined(__GNUC__) && (defined(STBI__X86_TARGET) || defined(STBI__X64_TARGET)) && !defined(__SSE2__) && !defined(STBI_NO_SIMD)
653 // NOTE: not clear do we actually need this for the 64-bit path?
654 // gcc doesn't support sse2 intrinsics unless you compile with -msse2,
655 // (but compiling with -msse2 allows the compiler to use SSE2 everywhere;
656 // this is just broken and gcc are jerks for not fixing it properly
657 // http://www.virtualdub.org/blog/pivot/entry.php?id=363 )
658 #define STBI_NO_SIMD
659 #endif
660 
661 #if defined(__MINGW32__) && defined(STBI__X86_TARGET) && !defined(STBI_MINGW_ENABLE_SSE2) && !defined(STBI_NO_SIMD)
662 // Note that __MINGW32__ doesn't actually mean 32-bit, so we have to avoid STBI__X64_TARGET
663 //
664 // 32-bit MinGW wants ESP to be 16-byte aligned, but this is not in the
665 // Windows ABI and VC++ as well as Windows DLLs don't maintain that invariant.
666 // As a result, enabling SSE2 on 32-bit MinGW is dangerous when not
667 // simultaneously enabling "-mstackrealign".
668 //
669 // See https://github.com/nothings/stb/issues/81 for more information.
670 //
671 // So default to no SSE2 on 32-bit MinGW. If you've read this far and added
672 // -mstackrealign to your build settings, feel free to #define STBI_MINGW_ENABLE_SSE2.
673 #define STBI_NO_SIMD
674 #endif
675 
676 #if !defined(STBI_NO_SIMD) && (defined(STBI__X86_TARGET) || defined(STBI__X64_TARGET))
677 #define STBI_SSE2
678 #include <emmintrin.h>
679 
680 #ifdef _MSC_VER
681 
682 #if _MSC_VER >= 1400 // not VC6
683 #include <intrin.h> // __cpuid
684 static int stbi__cpuid3(void)
685 {
686  int info[4];
687  __cpuid(info,1);
688  return info[3];
689 }
690 #else
691 static int stbi__cpuid3(void)
692 {
693  int res;
694  __asm {
695  mov eax,1
696  cpuid
697  mov res,edx
698  }
699  return res;
700 }
701 #endif
702 
703 #define STBI_SIMD_ALIGN(type, name) __declspec(align(16)) type name
704 
705 static int stbi__sse2_available()
706 {
707  int info3 = stbi__cpuid3();
708  return ((info3 >> 26) & 1) != 0;
709 }
710 #else // assume GCC-style if not VC++
711 #define STBI_SIMD_ALIGN(type, name) type name __attribute__((aligned(16)))
712 
713 static int stbi__sse2_available()
714 {
715 #if defined(__GNUC__) && (__GNUC__ * 100 + __GNUC_MINOR__) >= 408 // GCC 4.8 or later
716  // GCC 4.8+ has a nice way to do this
717  return __builtin_cpu_supports("sse2");
718 #else
719  // portable way to do this, preferably without using GCC inline ASM?
720  // just bail for now.
721  return 0;
722 #endif
723 }
724 #endif
725 #endif
726 
727 // ARM NEON
728 #if defined(STBI_NO_SIMD) && defined(STBI_NEON)
729 #undef STBI_NEON
730 #endif
731 
732 #ifdef STBI_NEON
733 #include <arm_neon.h>
734 // assume GCC or Clang on ARM targets
735 #define STBI_SIMD_ALIGN(type, name) type name __attribute__((aligned(16)))
736 #endif
737 
738 #ifndef STBI_SIMD_ALIGN
739 #define STBI_SIMD_ALIGN(type, name) type name
740 #endif
741 
743 //
744 // stbi__context struct and start_xxx functions
745 
746 // stbi__context structure is our basic context used by all images, so it
747 // contains all the IO context, plus some basic image information
748 typedef struct
749 {
750  stbi__uint32 img_x, img_y;
751  int img_n, img_out_n;
752 
754  void *io_user_data;
755 
756  int read_from_callbacks;
757  int buflen;
758  stbi_uc buffer_start[128];
759 
760  stbi_uc *img_buffer, *img_buffer_end;
761  stbi_uc *img_buffer_original, *img_buffer_original_end;
762 } stbi__context;
763 
764 
765 static void stbi__refill_buffer(stbi__context *s);
766 
767 // initialize a memory-decode context
768 static void stbi__start_mem(stbi__context *s, stbi_uc const *buffer, int len)
769 {
770  s->io.read = NULL;
771  s->read_from_callbacks = 0;
772  s->img_buffer = s->img_buffer_original = (stbi_uc *) buffer;
773  s->img_buffer_end = s->img_buffer_original_end = (stbi_uc *) buffer+len;
774 }
775 
776 // initialize a callback-based context
777 static void stbi__start_callbacks(stbi__context *s, stbi_io_callbacks *c, void *user)
778 {
779  s->io = *c;
780  s->io_user_data = user;
781  s->buflen = sizeof(s->buffer_start);
782  s->read_from_callbacks = 1;
783  s->img_buffer_original = s->buffer_start;
784  stbi__refill_buffer(s);
785  s->img_buffer_original_end = s->img_buffer_end;
786 }
787 
788 #ifndef STBI_NO_STDIO
789 
790 static int stbi__stdio_read(void *user, char *data, int size)
791 {
792  return (int) fread(data,1,size,(FILE*) user);
793 }
794 
795 static void stbi__stdio_skip(void *user, int n)
796 {
797  fseek((FILE*) user, n, SEEK_CUR);
798 }
799 
800 static int stbi__stdio_eof(void *user)
801 {
802  return feof((FILE*) user);
803 }
804 
805 static stbi_io_callbacks stbi__stdio_callbacks =
806 {
807  stbi__stdio_read,
808  stbi__stdio_skip,
809  stbi__stdio_eof,
810 };
811 
812 static void stbi__start_file(stbi__context *s, FILE *f)
813 {
814  stbi__start_callbacks(s, &stbi__stdio_callbacks, (void *) f);
815 }
816 
817 //static void stop_file(stbi__context *s) { }
818 
819 #endif // !STBI_NO_STDIO
820 
821 static void stbi__rewind(stbi__context *s)
822 {
823  // conceptually rewind SHOULD rewind to the beginning of the stream,
824  // but we just rewind to the beginning of the initial buffer, because
825  // we only use it after doing 'test', which only ever looks at at most 92 bytes
826  s->img_buffer = s->img_buffer_original;
827  s->img_buffer_end = s->img_buffer_original_end;
828 }
829 
830 #ifndef STBI_NO_JPEG
831 static int stbi__jpeg_test(stbi__context *s);
832 static stbi_uc *stbi__jpeg_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
833 static int stbi__jpeg_info(stbi__context *s, int *x, int *y, int *comp);
834 #endif
835 
836 #ifndef STBI_NO_PNG
837 static int stbi__png_test(stbi__context *s);
838 static stbi_uc *stbi__png_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
839 static int stbi__png_info(stbi__context *s, int *x, int *y, int *comp);
840 #endif
841 
842 #ifndef STBI_NO_BMP
843 static int stbi__bmp_test(stbi__context *s);
844 static stbi_uc *stbi__bmp_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
845 static int stbi__bmp_info(stbi__context *s, int *x, int *y, int *comp);
846 #endif
847 
848 #ifndef STBI_NO_TGA
849 static int stbi__tga_test(stbi__context *s);
850 static stbi_uc *stbi__tga_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
851 static int stbi__tga_info(stbi__context *s, int *x, int *y, int *comp);
852 #endif
853 
854 #ifndef STBI_NO_PSD
855 static int stbi__psd_test(stbi__context *s);
856 static stbi_uc *stbi__psd_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
857 static int stbi__psd_info(stbi__context *s, int *x, int *y, int *comp);
858 #endif
859 
860 #ifndef STBI_NO_HDR
861 static int stbi__hdr_test(stbi__context *s);
862 static float *stbi__hdr_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
863 static int stbi__hdr_info(stbi__context *s, int *x, int *y, int *comp);
864 #endif
865 
866 #ifndef STBI_NO_PIC
867 static int stbi__pic_test(stbi__context *s);
868 static stbi_uc *stbi__pic_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
869 static int stbi__pic_info(stbi__context *s, int *x, int *y, int *comp);
870 #endif
871 
872 #ifndef STBI_NO_GIF
873 static int stbi__gif_test(stbi__context *s);
874 static stbi_uc *stbi__gif_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
875 static int stbi__gif_info(stbi__context *s, int *x, int *y, int *comp);
876 #endif
877 
878 #ifndef STBI_NO_PNM
879 static int stbi__pnm_test(stbi__context *s);
880 static stbi_uc *stbi__pnm_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
881 static int stbi__pnm_info(stbi__context *s, int *x, int *y, int *comp);
882 #endif
883 
884 // this is not threadsafe
885 static const char *stbi__g_failure_reason;
886 
887 STBIDEF const char *stbi_failure_reason(void)
888 {
889  return stbi__g_failure_reason;
890 }
891 
892 static int stbi__err(const char *str)
893 {
894  stbi__g_failure_reason = str;
895  return 0;
896 }
897 
898 static void *stbi__malloc(size_t size)
899 {
900  return STBI_MALLOC(size);
901 }
902 
903 // stbi__err - error
904 // stbi__errpf - error returning pointer to float
905 // stbi__errpuc - error returning pointer to unsigned char
906 
907 #ifdef STBI_NO_FAILURE_STRINGS
908  #define stbi__err(x,y) 0
909 #elif defined(STBI_FAILURE_USERMSG)
910  #define stbi__err(x,y) stbi__err(y)
911 #else
912  #define stbi__err(x,y) stbi__err(x)
913 #endif
914 
915 #define stbi__errpf(x,y) ((float *)(size_t) (stbi__err(x,y)?NULL:NULL))
916 #define stbi__errpuc(x,y) ((unsigned char *)(size_t) (stbi__err(x,y)?NULL:NULL))
917 
918 STBIDEF void stbi_image_free(void *retval_from_stbi_load)
919 {
920  STBI_FREE(retval_from_stbi_load);
921 }
922 
923 #ifndef STBI_NO_LINEAR
924 static float *stbi__ldr_to_hdr(stbi_uc *data, int x, int y, int comp);
925 #endif
926 
927 #ifndef STBI_NO_HDR
928 static stbi_uc *stbi__hdr_to_ldr(float *data, int x, int y, int comp);
929 #endif
930 
931 static int stbi__vertically_flip_on_load = 0;
932 
933 STBIDEF void stbi_set_flip_vertically_on_load(int flag_true_if_should_flip)
934 {
935  stbi__vertically_flip_on_load = flag_true_if_should_flip;
936 }
937 
938 static unsigned char *stbi__load_main(stbi__context *s, int *x, int *y, int *comp, int req_comp)
939 {
940  #ifndef STBI_NO_JPEG
941  if (stbi__jpeg_test(s)) return stbi__jpeg_load(s,x,y,comp,req_comp);
942  #endif
943  #ifndef STBI_NO_PNG
944  if (stbi__png_test(s)) return stbi__png_load(s,x,y,comp,req_comp);
945  #endif
946  #ifndef STBI_NO_BMP
947  if (stbi__bmp_test(s)) return stbi__bmp_load(s,x,y,comp,req_comp);
948  #endif
949  #ifndef STBI_NO_GIF
950  if (stbi__gif_test(s)) return stbi__gif_load(s,x,y,comp,req_comp);
951  #endif
952  #ifndef STBI_NO_PSD
953  if (stbi__psd_test(s)) return stbi__psd_load(s,x,y,comp,req_comp);
954  #endif
955  #ifndef STBI_NO_PIC
956  if (stbi__pic_test(s)) return stbi__pic_load(s,x,y,comp,req_comp);
957  #endif
958  #ifndef STBI_NO_PNM
959  if (stbi__pnm_test(s)) return stbi__pnm_load(s,x,y,comp,req_comp);
960  #endif
961 
962  #ifndef STBI_NO_HDR
963  if (stbi__hdr_test(s)) {
964  float *hdr = stbi__hdr_load(s, x,y,comp,req_comp);
965  return stbi__hdr_to_ldr(hdr, *x, *y, req_comp ? req_comp : *comp);
966  }
967  #endif
968 
969  #ifndef STBI_NO_TGA
970  // test tga last because it's a crappy test!
971  if (stbi__tga_test(s))
972  return stbi__tga_load(s,x,y,comp,req_comp);
973  #endif
974 
975  return stbi__errpuc("unknown image type", "Image not of any known type, or corrupt");
976 }
977 
978 static unsigned char *stbi__load_flip(stbi__context *s, int *x, int *y, int *comp, int req_comp)
979 {
980  unsigned char *result = stbi__load_main(s, x, y, comp, req_comp);
981 
982  if (stbi__vertically_flip_on_load && result != NULL) {
983  int w = *x, h = *y;
984  int depth = req_comp ? req_comp : *comp;
985  int row,col,z;
986  stbi_uc temp;
987 
988  // @OPTIMIZE: use a bigger temp buffer and memcpy multiple pixels at once
989  for (row = 0; row < (h>>1); row++) {
990  for (col = 0; col < w; col++) {
991  for (z = 0; z < depth; z++) {
992  temp = result[(row * w + col) * depth + z];
993  result[(row * w + col) * depth + z] = result[((h - row - 1) * w + col) * depth + z];
994  result[((h - row - 1) * w + col) * depth + z] = temp;
995  }
996  }
997  }
998  }
999 
1000  return result;
1001 }
1002 
1003 #ifndef STBI_NO_HDR
1004 static void stbi__float_postprocess(float *result, int *x, int *y, int *comp, int req_comp)
1005 {
1006  if (stbi__vertically_flip_on_load && result != NULL) {
1007  int w = *x, h = *y;
1008  int depth = req_comp ? req_comp : *comp;
1009  int row,col,z;
1010  float temp;
1011 
1012  // @OPTIMIZE: use a bigger temp buffer and memcpy multiple pixels at once
1013  for (row = 0; row < (h>>1); row++) {
1014  for (col = 0; col < w; col++) {
1015  for (z = 0; z < depth; z++) {
1016  temp = result[(row * w + col) * depth + z];
1017  result[(row * w + col) * depth + z] = result[((h - row - 1) * w + col) * depth + z];
1018  result[((h - row - 1) * w + col) * depth + z] = temp;
1019  }
1020  }
1021  }
1022  }
1023 }
1024 #endif
1025 
1026 #ifndef STBI_NO_STDIO
1027 
1028 static FILE *stbi__fopen(char const *filename, char const *mode)
1029 {
1030  FILE *f;
1031 #if defined(_MSC_VER) && _MSC_VER >= 1400
1032  if (0 != fopen_s(&f, filename, mode))
1033  f=0;
1034 #else
1035  f = fopen(filename, mode);
1036 #endif
1037  return f;
1038 }
1039 
1040 
1041 STBIDEF stbi_uc *stbi_load(char const *filename, int *x, int *y, int *comp, int req_comp)
1042 {
1043  FILE *f = stbi__fopen(filename, "rb");
1044  unsigned char *result;
1045  if (!f) return stbi__errpuc("can't fopen", "Unable to open file");
1046  result = stbi_load_from_file(f,x,y,comp,req_comp);
1047  fclose(f);
1048  return result;
1049 }
1050 
1051 STBIDEF stbi_uc *stbi_load_from_file(FILE *f, int *x, int *y, int *comp, int req_comp)
1052 {
1053  unsigned char *result;
1054  stbi__context s;
1055  stbi__start_file(&s,f);
1056  result = stbi__load_flip(&s,x,y,comp,req_comp);
1057  if (result) {
1058  // need to 'unget' all the characters in the IO buffer
1059  fseek(f, - (int) (s.img_buffer_end - s.img_buffer), SEEK_CUR);
1060  }
1061  return result;
1062 }
1063 #endif
1064 
1065 STBIDEF stbi_uc *stbi_load_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp)
1066 {
1067  stbi__context s;
1068  stbi__start_mem(&s,buffer,len);
1069  return stbi__load_flip(&s,x,y,comp,req_comp);
1070 }
1071 
1072 STBIDEF stbi_uc *stbi_load_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp)
1073 {
1074  stbi__context s;
1075  stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
1076  return stbi__load_flip(&s,x,y,comp,req_comp);
1077 }
1078 
1079 #ifndef STBI_NO_LINEAR
1080 static float *stbi__loadf_main(stbi__context *s, int *x, int *y, int *comp, int req_comp)
1081 {
1082  unsigned char *data;
1083  #ifndef STBI_NO_HDR
1084  if (stbi__hdr_test(s)) {
1085  float *hdr_data = stbi__hdr_load(s,x,y,comp,req_comp);
1086  if (hdr_data)
1087  stbi__float_postprocess(hdr_data,x,y,comp,req_comp);
1088  return hdr_data;
1089  }
1090  #endif
1091  data = stbi__load_flip(s, x, y, comp, req_comp);
1092  if (data)
1093  return stbi__ldr_to_hdr(data, *x, *y, req_comp ? req_comp : *comp);
1094  return stbi__errpf("unknown image type", "Image not of any known type, or corrupt");
1095 }
1096 
1097 STBIDEF float *stbi_loadf_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp)
1098 {
1099  stbi__context s;
1100  stbi__start_mem(&s,buffer,len);
1101  return stbi__loadf_main(&s,x,y,comp,req_comp);
1102 }
1103 
1104 STBIDEF float *stbi_loadf_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp)
1105 {
1106  stbi__context s;
1107  stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
1108  return stbi__loadf_main(&s,x,y,comp,req_comp);
1109 }
1110 
1111 #ifndef STBI_NO_STDIO
1112 STBIDEF float *stbi_loadf(char const *filename, int *x, int *y, int *comp, int req_comp)
1113 {
1114  float *result;
1115  FILE *f = stbi__fopen(filename, "rb");
1116  if (!f) return stbi__errpf("can't fopen", "Unable to open file");
1117  result = stbi_loadf_from_file(f,x,y,comp,req_comp);
1118  fclose(f);
1119  return result;
1120 }
1121 
1122 STBIDEF float *stbi_loadf_from_file(FILE *f, int *x, int *y, int *comp, int req_comp)
1123 {
1124  stbi__context s;
1125  stbi__start_file(&s,f);
1126  return stbi__loadf_main(&s,x,y,comp,req_comp);
1127 }
1128 #endif // !STBI_NO_STDIO
1129 
1130 #endif // !STBI_NO_LINEAR
1131 
1132 // these is-hdr-or-not is defined independent of whether STBI_NO_LINEAR is
1133 // defined, for API simplicity; if STBI_NO_LINEAR is defined, it always
1134 // reports false!
1135 
1136 STBIDEF int stbi_is_hdr_from_memory(stbi_uc const *buffer, int len)
1137 {
1138  #ifndef STBI_NO_HDR
1139  stbi__context s;
1140  stbi__start_mem(&s,buffer,len);
1141  return stbi__hdr_test(&s);
1142  #else
1143  STBI_NOTUSED(buffer);
1144  STBI_NOTUSED(len);
1145  return 0;
1146  #endif
1147 }
1148 
1149 #ifndef STBI_NO_STDIO
1150 STBIDEF int stbi_is_hdr (char const *filename)
1151 {
1152  FILE *f = stbi__fopen(filename, "rb");
1153  int result=0;
1154  if (f) {
1155  result = stbi_is_hdr_from_file(f);
1156  fclose(f);
1157  }
1158  return result;
1159 }
1160 
1161 STBIDEF int stbi_is_hdr_from_file(FILE *f)
1162 {
1163  #ifndef STBI_NO_HDR
1164  stbi__context s;
1165  stbi__start_file(&s,f);
1166  return stbi__hdr_test(&s);
1167  #else
1168  STBI_NOTUSED(f);
1169  return 0;
1170  #endif
1171 }
1172 #endif // !STBI_NO_STDIO
1173 
1174 STBIDEF int stbi_is_hdr_from_callbacks(stbi_io_callbacks const *clbk, void *user)
1175 {
1176  #ifndef STBI_NO_HDR
1177  stbi__context s;
1178  stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
1179  return stbi__hdr_test(&s);
1180  #else
1181  STBI_NOTUSED(clbk);
1182  STBI_NOTUSED(user);
1183  return 0;
1184  #endif
1185 }
1186 
1187 #ifndef STBI_NO_LINEAR
1188 static float stbi__l2h_gamma=2.2f, stbi__l2h_scale=1.0f;
1189 
1190 STBIDEF void stbi_ldr_to_hdr_gamma(float gamma) { stbi__l2h_gamma = gamma; }
1191 STBIDEF void stbi_ldr_to_hdr_scale(float scale) { stbi__l2h_scale = scale; }
1192 #endif
1193 
1194 static float stbi__h2l_gamma_i=1.0f/2.2f, stbi__h2l_scale_i=1.0f;
1195 
1196 STBIDEF void stbi_hdr_to_ldr_gamma(float gamma) { stbi__h2l_gamma_i = 1/gamma; }
1197 STBIDEF void stbi_hdr_to_ldr_scale(float scale) { stbi__h2l_scale_i = 1/scale; }
1198 
1199 
1201 //
1202 // Common code used by all image loaders
1203 //
1204 
1205 enum
1206 {
1207  STBI__SCAN_load=0,
1208  STBI__SCAN_type,
1209  STBI__SCAN_header
1210 };
1211 
1212 static void stbi__refill_buffer(stbi__context *s)
1213 {
1214  int n = (s->io.read)(s->io_user_data,(char*)s->buffer_start,s->buflen);
1215  if (n == 0) {
1216  // at end of file, treat same as if from memory, but need to handle case
1217  // where s->img_buffer isn't pointing to safe memory, e.g. 0-byte file
1218  s->read_from_callbacks = 0;
1219  s->img_buffer = s->buffer_start;
1220  s->img_buffer_end = s->buffer_start+1;
1221  *s->img_buffer = 0;
1222  } else {
1223  s->img_buffer = s->buffer_start;
1224  s->img_buffer_end = s->buffer_start + n;
1225  }
1226 }
1227 
1228 stbi_inline static stbi_uc stbi__get8(stbi__context *s)
1229 {
1230  if (s->img_buffer < s->img_buffer_end)
1231  return *s->img_buffer++;
1232  if (s->read_from_callbacks) {
1233  stbi__refill_buffer(s);
1234  return *s->img_buffer++;
1235  }
1236  return 0;
1237 }
1238 
1239 stbi_inline static int stbi__at_eof(stbi__context *s)
1240 {
1241  if (s->io.read) {
1242  if (!(s->io.eof)(s->io_user_data)) return 0;
1243  // if feof() is true, check if buffer = end
1244  // special case: we've only got the special 0 character at the end
1245  if (s->read_from_callbacks == 0) return 1;
1246  }
1247 
1248  return s->img_buffer >= s->img_buffer_end;
1249 }
1250 
1251 static void stbi__skip(stbi__context *s, int n)
1252 {
1253  if (n < 0) {
1254  s->img_buffer = s->img_buffer_end;
1255  return;
1256  }
1257  if (s->io.read) {
1258  int blen = (int) (s->img_buffer_end - s->img_buffer);
1259  if (blen < n) {
1260  s->img_buffer = s->img_buffer_end;
1261  (s->io.skip)(s->io_user_data, n - blen);
1262  return;
1263  }
1264  }
1265  s->img_buffer += n;
1266 }
1267 
1268 static int stbi__getn(stbi__context *s, stbi_uc *buffer, int n)
1269 {
1270  if (s->io.read) {
1271  int blen = (int) (s->img_buffer_end - s->img_buffer);
1272  if (blen < n) {
1273  int res, count;
1274 
1275  memcpy(buffer, s->img_buffer, blen);
1276 
1277  count = (s->io.read)(s->io_user_data, (char*) buffer + blen, n - blen);
1278  res = (count == (n-blen));
1279  s->img_buffer = s->img_buffer_end;
1280  return res;
1281  }
1282  }
1283 
1284  if (s->img_buffer+n <= s->img_buffer_end) {
1285  memcpy(buffer, s->img_buffer, n);
1286  s->img_buffer += n;
1287  return 1;
1288  } else
1289  return 0;
1290 }
1291 
1292 static int stbi__get16be(stbi__context *s)
1293 {
1294  int z = stbi__get8(s);
1295  return (z << 8) + stbi__get8(s);
1296 }
1297 
1298 static stbi__uint32 stbi__get32be(stbi__context *s)
1299 {
1300  stbi__uint32 z = stbi__get16be(s);
1301  return (z << 16) + stbi__get16be(s);
1302 }
1303 
1304 #if defined(STBI_NO_BMP) && defined(STBI_NO_TGA) && defined(STBI_NO_GIF)
1305 // nothing
1306 #else
1307 static int stbi__get16le(stbi__context *s)
1308 {
1309  int z = stbi__get8(s);
1310  return z + (stbi__get8(s) << 8);
1311 }
1312 #endif
1313 
1314 #ifndef STBI_NO_BMP
1315 static stbi__uint32 stbi__get32le(stbi__context *s)
1316 {
1317  stbi__uint32 z = stbi__get16le(s);
1318  return z + (stbi__get16le(s) << 16);
1319 }
1320 #endif
1321 
1322 #define STBI__BYTECAST(x) ((stbi_uc) ((x) & 255)) // truncate int to byte without warnings
1323 
1324 
1326 //
1327 // generic converter from built-in img_n to req_comp
1328 // individual types do this automatically as much as possible (e.g. jpeg
1329 // does all cases internally since it needs to colorspace convert anyway,
1330 // and it never has alpha, so very few cases ). png can automatically
1331 // interleave an alpha=255 channel, but falls back to this for other cases
1332 //
1333 // assume data buffer is malloced, so malloc a new one and free that one
1334 // only failure mode is malloc failing
1335 
1336 static stbi_uc stbi__compute_y(int r, int g, int b)
1337 {
1338  return (stbi_uc) (((r*77) + (g*150) + (29*b)) >> 8);
1339 }
1340 
1341 static unsigned char *stbi__convert_format(unsigned char *data, int img_n, int req_comp, unsigned int x, unsigned int y)
1342 {
1343  int i,j;
1344  unsigned char *good;
1345 
1346  if (req_comp == img_n) return data;
1347  STBI_ASSERT(req_comp >= 1 && req_comp <= 4);
1348 
1349  good = (unsigned char *) stbi__malloc(req_comp * x * y);
1350  if (good == NULL) {
1351  STBI_FREE(data);
1352  return stbi__errpuc("outofmem", "Out of memory");
1353  }
1354 
1355  for (j=0; j < (int) y; ++j) {
1356  unsigned char *src = data + j * x * img_n ;
1357  unsigned char *dest = good + j * x * req_comp;
1358 
1359  #define COMBO(a,b) ((a)*8+(b))
1360  #define CASE(a,b) case COMBO(a,b): for(i=x-1; i >= 0; --i, src += a, dest += b)
1361  // convert source image with img_n components to one with req_comp components;
1362  // avoid switch per pixel, so use switch per scanline and massive macros
1363  switch (COMBO(img_n, req_comp)) {
1364  CASE(1,2) dest[0]=src[0], dest[1]=255; break;
1365  CASE(1,3) dest[0]=dest[1]=dest[2]=src[0]; break;
1366  CASE(1,4) dest[0]=dest[1]=dest[2]=src[0], dest[3]=255; break;
1367  CASE(2,1) dest[0]=src[0]; break;
1368  CASE(2,3) dest[0]=dest[1]=dest[2]=src[0]; break;
1369  CASE(2,4) dest[0]=dest[1]=dest[2]=src[0], dest[3]=src[1]; break;
1370  CASE(3,4) dest[0]=src[0],dest[1]=src[1],dest[2]=src[2],dest[3]=255; break;
1371  CASE(3,1) dest[0]=stbi__compute_y(src[0],src[1],src[2]); break;
1372  CASE(3,2) dest[0]=stbi__compute_y(src[0],src[1],src[2]), dest[1] = 255; break;
1373  CASE(4,1) dest[0]=stbi__compute_y(src[0],src[1],src[2]); break;
1374  CASE(4,2) dest[0]=stbi__compute_y(src[0],src[1],src[2]), dest[1] = src[3]; break;
1375  CASE(4,3) dest[0]=src[0],dest[1]=src[1],dest[2]=src[2]; break;
1376  default: STBI_ASSERT(0);
1377  }
1378  #undef CASE
1379  }
1380 
1381  STBI_FREE(data);
1382  return good;
1383 }
1384 
1385 #ifndef STBI_NO_LINEAR
1386 static float *stbi__ldr_to_hdr(stbi_uc *data, int x, int y, int comp)
1387 {
1388  int i,k,n;
1389  float *output = (float *) stbi__malloc(x * y * comp * sizeof(float));
1390  if (output == NULL) { STBI_FREE(data); return stbi__errpf("outofmem", "Out of memory"); }
1391  // compute number of non-alpha components
1392  if (comp & 1) n = comp; else n = comp-1;
1393  for (i=0; i < x*y; ++i) {
1394  for (k=0; k < n; ++k) {
1395  output[i*comp + k] = (float) (pow(data[i*comp+k]/255.0f, stbi__l2h_gamma) * stbi__l2h_scale);
1396  }
1397  if (k < comp) output[i*comp + k] = data[i*comp+k]/255.0f;
1398  }
1399  STBI_FREE(data);
1400  return output;
1401 }
1402 #endif
1403 
1404 #ifndef STBI_NO_HDR
1405 #define stbi__float2int(x) ((int) (x))
1406 static stbi_uc *stbi__hdr_to_ldr(float *data, int x, int y, int comp)
1407 {
1408  int i,k,n;
1409  stbi_uc *output = (stbi_uc *) stbi__malloc(x * y * comp);
1410  if (output == NULL) { STBI_FREE(data); return stbi__errpuc("outofmem", "Out of memory"); }
1411  // compute number of non-alpha components
1412  if (comp & 1) n = comp; else n = comp-1;
1413  for (i=0; i < x*y; ++i) {
1414  for (k=0; k < n; ++k) {
1415  float z = (float) pow(data[i*comp+k]*stbi__h2l_scale_i, stbi__h2l_gamma_i) * 255 + 0.5f;
1416  if (z < 0) z = 0;
1417  if (z > 255) z = 255;
1418  output[i*comp + k] = (stbi_uc) stbi__float2int(z);
1419  }
1420  if (k < comp) {
1421  float z = data[i*comp+k] * 255 + 0.5f;
1422  if (z < 0) z = 0;
1423  if (z > 255) z = 255;
1424  output[i*comp + k] = (stbi_uc) stbi__float2int(z);
1425  }
1426  }
1427  STBI_FREE(data);
1428  return output;
1429 }
1430 #endif
1431 
1433 //
1434 // "baseline" JPEG/JFIF decoder
1435 //
1436 // simple implementation
1437 // - doesn't support delayed output of y-dimension
1438 // - simple interface (only one output format: 8-bit interleaved RGB)
1439 // - doesn't try to recover corrupt jpegs
1440 // - doesn't allow partial loading, loading multiple at once
1441 // - still fast on x86 (copying globals into locals doesn't help x86)
1442 // - allocates lots of intermediate memory (full size of all components)
1443 // - non-interleaved case requires this anyway
1444 // - allows good upsampling (see next)
1445 // high-quality
1446 // - upsampled channels are bilinearly interpolated, even across blocks
1447 // - quality integer IDCT derived from IJG's 'slow'
1448 // performance
1449 // - fast huffman; reasonable integer IDCT
1450 // - some SIMD kernels for common paths on targets with SSE2/NEON
1451 // - uses a lot of intermediate memory, could cache poorly
1452 
1453 #ifndef STBI_NO_JPEG
1454 
1455 // huffman decoding acceleration
1456 #define FAST_BITS 9 // larger handles more cases; smaller stomps less cache
1457 
1458 typedef struct
1459 {
1460  stbi_uc fast[1 << FAST_BITS];
1461  // weirdly, repacking this into AoS is a 10% speed loss, instead of a win
1462  stbi__uint16 code[256];
1463  stbi_uc values[256];
1464  stbi_uc size[257];
1465  unsigned int maxcode[18];
1466  int delta[17]; // old 'firstsymbol' - old 'firstcode'
1467 } stbi__huffman;
1468 
1469 typedef struct
1470 {
1471  stbi__context *s;
1472  stbi__huffman huff_dc[4];
1473  stbi__huffman huff_ac[4];
1474  stbi_uc dequant[4][64];
1475  stbi__int16 fast_ac[4][1 << FAST_BITS];
1476 
1477 // sizes for components, interleaved MCUs
1478  int img_h_max, img_v_max;
1479  int img_mcu_x, img_mcu_y;
1480  int img_mcu_w, img_mcu_h;
1481 
1482 // definition of jpeg image component
1483  struct
1484  {
1485  int id;
1486  int h,v;
1487  int tq;
1488  int hd,ha;
1489  int dc_pred;
1490 
1491  int x,y,w2,h2;
1492  stbi_uc *data;
1493  void *raw_data, *raw_coeff;
1494  stbi_uc *linebuf;
1495  short *coeff; // progressive only
1496  int coeff_w, coeff_h; // number of 8x8 coefficient blocks
1497  } img_comp[4];
1498 
1499  stbi__uint32 code_buffer; // jpeg entropy-coded buffer
1500  int code_bits; // number of valid bits
1501  unsigned char marker; // marker seen while filling entropy buffer
1502  int nomore; // flag if we saw a marker so must stop
1503 
1504  int progressive;
1505  int spec_start;
1506  int spec_end;
1507  int succ_high;
1508  int succ_low;
1509  int eob_run;
1510  int rgb;
1511 
1512  int scan_n, order[4];
1513  int restart_interval, todo;
1514 
1515 // kernels
1516  void (*idct_block_kernel)(stbi_uc *out, int out_stride, short data[64]);
1517  void (*YCbCr_to_RGB_kernel)(stbi_uc *out, const stbi_uc *y, const stbi_uc *pcb, const stbi_uc *pcr, int count, int step);
1518  stbi_uc *(*resample_row_hv_2_kernel)(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs);
1519 } stbi__jpeg;
1520 
1521 static int stbi__build_huffman(stbi__huffman *h, int *count)
1522 {
1523  int i,j,k=0,code;
1524  // build size list for each symbol (from JPEG spec)
1525  for (i=0; i < 16; ++i)
1526  for (j=0; j < count[i]; ++j)
1527  h->size[k++] = (stbi_uc) (i+1);
1528  h->size[k] = 0;
1529 
1530  // compute actual symbols (from jpeg spec)
1531  code = 0;
1532  k = 0;
1533  for(j=1; j <= 16; ++j) {
1534  // compute delta to add to code to compute symbol id
1535  h->delta[j] = k - code;
1536  if (h->size[k] == j) {
1537  while (h->size[k] == j)
1538  h->code[k++] = (stbi__uint16) (code++);
1539  if (code-1 >= (1 << j)) return stbi__err("bad code lengths","Corrupt JPEG");
1540  }
1541  // compute largest code + 1 for this size, preshifted as needed later
1542  h->maxcode[j] = code << (16-j);
1543  code <<= 1;
1544  }
1545  h->maxcode[j] = 0xffffffff;
1546 
1547  // build non-spec acceleration table; 255 is flag for not-accelerated
1548  memset(h->fast, 255, 1 << FAST_BITS);
1549  for (i=0; i < k; ++i) {
1550  int s = h->size[i];
1551  if (s <= FAST_BITS) {
1552  int c = h->code[i] << (FAST_BITS-s);
1553  int m = 1 << (FAST_BITS-s);
1554  for (j=0; j < m; ++j) {
1555  h->fast[c+j] = (stbi_uc) i;
1556  }
1557  }
1558  }
1559  return 1;
1560 }
1561 
1562 // build a table that decodes both magnitude and value of small ACs in
1563 // one go.
1564 static void stbi__build_fast_ac(stbi__int16 *fast_ac, stbi__huffman *h)
1565 {
1566  int i;
1567  for (i=0; i < (1 << FAST_BITS); ++i) {
1568  stbi_uc fast = h->fast[i];
1569  fast_ac[i] = 0;
1570  if (fast < 255) {
1571  int rs = h->values[fast];
1572  int run = (rs >> 4) & 15;
1573  int magbits = rs & 15;
1574  int len = h->size[fast];
1575 
1576  if (magbits && len + magbits <= FAST_BITS) {
1577  // magnitude code followed by receive_extend code
1578  int k = ((i << len) & ((1 << FAST_BITS) - 1)) >> (FAST_BITS - magbits);
1579  int m = 1 << (magbits - 1);
1580  if (k < m) k += (-1 << magbits) + 1;
1581  // if the result is small enough, we can fit it in fast_ac table
1582  if (k >= -128 && k <= 127)
1583  fast_ac[i] = (stbi__int16) ((k << 8) + (run << 4) + (len + magbits));
1584  }
1585  }
1586  }
1587 }
1588 
1589 static void stbi__grow_buffer_unsafe(stbi__jpeg *j)
1590 {
1591  do {
1592  int b = j->nomore ? 0 : stbi__get8(j->s);
1593  if (b == 0xff) {
1594  int c = stbi__get8(j->s);
1595  if (c != 0) {
1596  j->marker = (unsigned char) c;
1597  j->nomore = 1;
1598  return;
1599  }
1600  }
1601  j->code_buffer |= b << (24 - j->code_bits);
1602  j->code_bits += 8;
1603  } while (j->code_bits <= 24);
1604 }
1605 
1606 // (1 << n) - 1
1607 static stbi__uint32 stbi__bmask[17]={0,1,3,7,15,31,63,127,255,511,1023,2047,4095,8191,16383,32767,65535};
1608 
1609 // decode a jpeg huffman value from the bitstream
1610 stbi_inline static int stbi__jpeg_huff_decode(stbi__jpeg *j, stbi__huffman *h)
1611 {
1612  unsigned int temp;
1613  int c,k;
1614 
1615  if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
1616 
1617  // look at the top FAST_BITS and determine what symbol ID it is,
1618  // if the code is <= FAST_BITS
1619  c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
1620  k = h->fast[c];
1621  if (k < 255) {
1622  int s = h->size[k];
1623  if (s > j->code_bits)
1624  return -1;
1625  j->code_buffer <<= s;
1626  j->code_bits -= s;
1627  return h->values[k];
1628  }
1629 
1630  // naive test is to shift the code_buffer down so k bits are
1631  // valid, then test against maxcode. To speed this up, we've
1632  // preshifted maxcode left so that it has (16-k) 0s at the
1633  // end; in other words, regardless of the number of bits, it
1634  // wants to be compared against something shifted to have 16;
1635  // that way we don't need to shift inside the loop.
1636  temp = j->code_buffer >> 16;
1637  for (k=FAST_BITS+1 ; ; ++k)
1638  if (temp < h->maxcode[k])
1639  break;
1640  if (k == 17) {
1641  // error! code not found
1642  j->code_bits -= 16;
1643  return -1;
1644  }
1645 
1646  if (k > j->code_bits)
1647  return -1;
1648 
1649  // convert the huffman code to the symbol id
1650  c = ((j->code_buffer >> (32 - k)) & stbi__bmask[k]) + h->delta[k];
1651  STBI_ASSERT((((j->code_buffer) >> (32 - h->size[c])) & stbi__bmask[h->size[c]]) == h->code[c]);
1652 
1653  // convert the id to a symbol
1654  j->code_bits -= k;
1655  j->code_buffer <<= k;
1656  return h->values[c];
1657 }
1658 
1659 // bias[n] = (-1<<n) + 1
1660 static int const stbi__jbias[16] = {0,-1,-3,-7,-15,-31,-63,-127,-255,-511,-1023,-2047,-4095,-8191,-16383,-32767};
1661 
1662 // combined JPEG 'receive' and JPEG 'extend', since baseline
1663 // always extends everything it receives.
1664 stbi_inline static int stbi__extend_receive(stbi__jpeg *j, int n)
1665 {
1666  unsigned int k;
1667  int sgn;
1668  if (j->code_bits < n) stbi__grow_buffer_unsafe(j);
1669 
1670  sgn = (stbi__int32)j->code_buffer >> 31; // sign bit is always in MSB
1671  k = stbi_lrot(j->code_buffer, n);
1672  STBI_ASSERT(n >= 0 && n < (int) (sizeof(stbi__bmask)/sizeof(*stbi__bmask)));
1673  j->code_buffer = k & ~stbi__bmask[n];
1674  k &= stbi__bmask[n];
1675  j->code_bits -= n;
1676  return k + (stbi__jbias[n] & ~sgn);
1677 }
1678 
1679 // get some unsigned bits
1680 stbi_inline static int stbi__jpeg_get_bits(stbi__jpeg *j, int n)
1681 {
1682  unsigned int k;
1683  if (j->code_bits < n) stbi__grow_buffer_unsafe(j);
1684  k = stbi_lrot(j->code_buffer, n);
1685  j->code_buffer = k & ~stbi__bmask[n];
1686  k &= stbi__bmask[n];
1687  j->code_bits -= n;
1688  return k;
1689 }
1690 
1691 stbi_inline static int stbi__jpeg_get_bit(stbi__jpeg *j)
1692 {
1693  unsigned int k;
1694  if (j->code_bits < 1) stbi__grow_buffer_unsafe(j);
1695  k = j->code_buffer;
1696  j->code_buffer <<= 1;
1697  --j->code_bits;
1698  return k & 0x80000000;
1699 }
1700 
1701 // given a value that's at position X in the zigzag stream,
1702 // where does it appear in the 8x8 matrix coded as row-major?
1703 static stbi_uc stbi__jpeg_dezigzag[64+15] =
1704 {
1705  0, 1, 8, 16, 9, 2, 3, 10,
1706  17, 24, 32, 25, 18, 11, 4, 5,
1707  12, 19, 26, 33, 40, 48, 41, 34,
1708  27, 20, 13, 6, 7, 14, 21, 28,
1709  35, 42, 49, 56, 57, 50, 43, 36,
1710  29, 22, 15, 23, 30, 37, 44, 51,
1711  58, 59, 52, 45, 38, 31, 39, 46,
1712  53, 60, 61, 54, 47, 55, 62, 63,
1713  // let corrupt input sample past end
1714  63, 63, 63, 63, 63, 63, 63, 63,
1715  63, 63, 63, 63, 63, 63, 63
1716 };
1717 
1718 // decode one 64-entry block--
1719 static int stbi__jpeg_decode_block(stbi__jpeg *j, short data[64], stbi__huffman *hdc, stbi__huffman *hac, stbi__int16 *fac, int b, stbi_uc *dequant)
1720 {
1721  int diff,dc,k;
1722  int t;
1723 
1724  if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
1725  t = stbi__jpeg_huff_decode(j, hdc);
1726  if (t < 0) return stbi__err("bad huffman code","Corrupt JPEG");
1727 
1728  // 0 all the ac values now so we can do it 32-bits at a time
1729  memset(data,0,64*sizeof(data[0]));
1730 
1731  diff = t ? stbi__extend_receive(j, t) : 0;
1732  dc = j->img_comp[b].dc_pred + diff;
1733  j->img_comp[b].dc_pred = dc;
1734  data[0] = (short) (dc * dequant[0]);
1735 
1736  // decode AC components, see JPEG spec
1737  k = 1;
1738  do {
1739  unsigned int zig;
1740  int c,r,s;
1741  if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
1742  c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
1743  r = fac[c];
1744  if (r) { // fast-AC path
1745  k += (r >> 4) & 15; // run
1746  s = r & 15; // combined length
1747  j->code_buffer <<= s;
1748  j->code_bits -= s;
1749  // decode into unzigzag'd location
1750  zig = stbi__jpeg_dezigzag[k++];
1751  data[zig] = (short) ((r >> 8) * dequant[zig]);
1752  } else {
1753  int rs = stbi__jpeg_huff_decode(j, hac);
1754  if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG");
1755  s = rs & 15;
1756  r = rs >> 4;
1757  if (s == 0) {
1758  if (rs != 0xf0) break; // end block
1759  k += 16;
1760  } else {
1761  k += r;
1762  // decode into unzigzag'd location
1763  zig = stbi__jpeg_dezigzag[k++];
1764  data[zig] = (short) (stbi__extend_receive(j,s) * dequant[zig]);
1765  }
1766  }
1767  } while (k < 64);
1768  return 1;
1769 }
1770 
1771 static int stbi__jpeg_decode_block_prog_dc(stbi__jpeg *j, short data[64], stbi__huffman *hdc, int b)
1772 {
1773  int diff,dc;
1774  int t;
1775  if (j->spec_end != 0) return stbi__err("can't merge dc and ac", "Corrupt JPEG");
1776 
1777  if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
1778 
1779  if (j->succ_high == 0) {
1780  // first scan for DC coefficient, must be first
1781  memset(data,0,64*sizeof(data[0])); // 0 all the ac values now
1782  t = stbi__jpeg_huff_decode(j, hdc);
1783  diff = t ? stbi__extend_receive(j, t) : 0;
1784 
1785  dc = j->img_comp[b].dc_pred + diff;
1786  j->img_comp[b].dc_pred = dc;
1787  data[0] = (short) (dc << j->succ_low);
1788  } else {
1789  // refinement scan for DC coefficient
1790  if (stbi__jpeg_get_bit(j))
1791  data[0] += (short) (1 << j->succ_low);
1792  }
1793  return 1;
1794 }
1795 
1796 // @OPTIMIZE: store non-zigzagged during the decode passes,
1797 // and only de-zigzag when dequantizing
1798 static int stbi__jpeg_decode_block_prog_ac(stbi__jpeg *j, short data[64], stbi__huffman *hac, stbi__int16 *fac)
1799 {
1800  int k;
1801  if (j->spec_start == 0) return stbi__err("can't merge dc and ac", "Corrupt JPEG");
1802 
1803  if (j->succ_high == 0) {
1804  int shift = j->succ_low;
1805 
1806  if (j->eob_run) {
1807  --j->eob_run;
1808  return 1;
1809  }
1810 
1811  k = j->spec_start;
1812  do {
1813  unsigned int zig;
1814  int c,r,s;
1815  if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
1816  c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
1817  r = fac[c];
1818  if (r) { // fast-AC path
1819  k += (r >> 4) & 15; // run
1820  s = r & 15; // combined length
1821  j->code_buffer <<= s;
1822  j->code_bits -= s;
1823  zig = stbi__jpeg_dezigzag[k++];
1824  data[zig] = (short) ((r >> 8) << shift);
1825  } else {
1826  int rs = stbi__jpeg_huff_decode(j, hac);
1827  if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG");
1828  s = rs & 15;
1829  r = rs >> 4;
1830  if (s == 0) {
1831  if (r < 15) {
1832  j->eob_run = (1 << r);
1833  if (r)
1834  j->eob_run += stbi__jpeg_get_bits(j, r);
1835  --j->eob_run;
1836  break;
1837  }
1838  k += 16;
1839  } else {
1840  k += r;
1841  zig = stbi__jpeg_dezigzag[k++];
1842  data[zig] = (short) (stbi__extend_receive(j,s) << shift);
1843  }
1844  }
1845  } while (k <= j->spec_end);
1846  } else {
1847  // refinement scan for these AC coefficients
1848 
1849  short bit = (short) (1 << j->succ_low);
1850 
1851  if (j->eob_run) {
1852  --j->eob_run;
1853  for (k = j->spec_start; k <= j->spec_end; ++k) {
1854  short *p = &data[stbi__jpeg_dezigzag[k]];
1855  if (*p != 0)
1856  if (stbi__jpeg_get_bit(j))
1857  if ((*p & bit)==0) {
1858  if (*p > 0)
1859  *p += bit;
1860  else
1861  *p -= bit;
1862  }
1863  }
1864  } else {
1865  k = j->spec_start;
1866  do {
1867  int r,s;
1868  int rs = stbi__jpeg_huff_decode(j, hac); // @OPTIMIZE see if we can use the fast path here, advance-by-r is so slow, eh
1869  if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG");
1870  s = rs & 15;
1871  r = rs >> 4;
1872  if (s == 0) {
1873  if (r < 15) {
1874  j->eob_run = (1 << r) - 1;
1875  if (r)
1876  j->eob_run += stbi__jpeg_get_bits(j, r);
1877  r = 64; // force end of block
1878  } else {
1879  // r=15 s=0 should write 16 0s, so we just do
1880  // a run of 15 0s and then write s (which is 0),
1881  // so we don't have to do anything special here
1882  }
1883  } else {
1884  if (s != 1) return stbi__err("bad huffman code", "Corrupt JPEG");
1885  // sign bit
1886  if (stbi__jpeg_get_bit(j))
1887  s = bit;
1888  else
1889  s = -bit;
1890  }
1891 
1892  // advance by r
1893  while (k <= j->spec_end) {
1894  short *p = &data[stbi__jpeg_dezigzag[k++]];
1895  if (*p != 0) {
1896  if (stbi__jpeg_get_bit(j))
1897  if ((*p & bit)==0) {
1898  if (*p > 0)
1899  *p += bit;
1900  else
1901  *p -= bit;
1902  }
1903  } else {
1904  if (r == 0) {
1905  *p = (short) s;
1906  break;
1907  }
1908  --r;
1909  }
1910  }
1911  } while (k <= j->spec_end);
1912  }
1913  }
1914  return 1;
1915 }
1916 
1917 // take a -128..127 value and stbi__clamp it and convert to 0..255
1918 stbi_inline static stbi_uc stbi__clamp(int x)
1919 {
1920  // trick to use a single test to catch both cases
1921  if ((unsigned int) x > 255) {
1922  if (x < 0) return 0;
1923  if (x > 255) return 255;
1924  }
1925  return (stbi_uc) x;
1926 }
1927 
1928 #define stbi__f2f(x) ((int) (((x) * 4096 + 0.5)))
1929 #define stbi__fsh(x) ((x) << 12)
1930 
1931 // derived from jidctint -- DCT_ISLOW
1932 #define STBI__IDCT_1D(s0,s1,s2,s3,s4,s5,s6,s7) \
1933  int t0,t1,t2,t3,p1,p2,p3,p4,p5,x0,x1,x2,x3; \
1934  p2 = s2; \
1935  p3 = s6; \
1936  p1 = (p2+p3) * stbi__f2f(0.5411961f); \
1937  t2 = p1 + p3*stbi__f2f(-1.847759065f); \
1938  t3 = p1 + p2*stbi__f2f( 0.765366865f); \
1939  p2 = s0; \
1940  p3 = s4; \
1941  t0 = stbi__fsh(p2+p3); \
1942  t1 = stbi__fsh(p2-p3); \
1943  x0 = t0+t3; \
1944  x3 = t0-t3; \
1945  x1 = t1+t2; \
1946  x2 = t1-t2; \
1947  t0 = s7; \
1948  t1 = s5; \
1949  t2 = s3; \
1950  t3 = s1; \
1951  p3 = t0+t2; \
1952  p4 = t1+t3; \
1953  p1 = t0+t3; \
1954  p2 = t1+t2; \
1955  p5 = (p3+p4)*stbi__f2f( 1.175875602f); \
1956  t0 = t0*stbi__f2f( 0.298631336f); \
1957  t1 = t1*stbi__f2f( 2.053119869f); \
1958  t2 = t2*stbi__f2f( 3.072711026f); \
1959  t3 = t3*stbi__f2f( 1.501321110f); \
1960  p1 = p5 + p1*stbi__f2f(-0.899976223f); \
1961  p2 = p5 + p2*stbi__f2f(-2.562915447f); \
1962  p3 = p3*stbi__f2f(-1.961570560f); \
1963  p4 = p4*stbi__f2f(-0.390180644f); \
1964  t3 += p1+p4; \
1965  t2 += p2+p3; \
1966  t1 += p2+p4; \
1967  t0 += p1+p3;
1968 
1969 static void stbi__idct_block(stbi_uc *out, int out_stride, short data[64])
1970 {
1971  int i,val[64],*v=val;
1972  stbi_uc *o;
1973  short *d = data;
1974 
1975  // columns
1976  for (i=0; i < 8; ++i,++d, ++v) {
1977  // if all zeroes, shortcut -- this avoids dequantizing 0s and IDCTing
1978  if (d[ 8]==0 && d[16]==0 && d[24]==0 && d[32]==0
1979  && d[40]==0 && d[48]==0 && d[56]==0) {
1980  // no shortcut 0 seconds
1981  // (1|2|3|4|5|6|7)==0 0 seconds
1982  // all separate -0.047 seconds
1983  // 1 && 2|3 && 4|5 && 6|7: -0.047 seconds
1984  int dcterm = d[0] << 2;
1985  v[0] = v[8] = v[16] = v[24] = v[32] = v[40] = v[48] = v[56] = dcterm;
1986  } else {
1987  STBI__IDCT_1D(d[ 0],d[ 8],d[16],d[24],d[32],d[40],d[48],d[56])
1988  // constants scaled things up by 1<<12; let's bring them back
1989  // down, but keep 2 extra bits of precision
1990  x0 += 512; x1 += 512; x2 += 512; x3 += 512;
1991  v[ 0] = (x0+t3) >> 10;
1992  v[56] = (x0-t3) >> 10;
1993  v[ 8] = (x1+t2) >> 10;
1994  v[48] = (x1-t2) >> 10;
1995  v[16] = (x2+t1) >> 10;
1996  v[40] = (x2-t1) >> 10;
1997  v[24] = (x3+t0) >> 10;
1998  v[32] = (x3-t0) >> 10;
1999  }
2000  }
2001 
2002  for (i=0, v=val, o=out; i < 8; ++i,v+=8,o+=out_stride) {
2003  // no fast case since the first 1D IDCT spread components out
2004  STBI__IDCT_1D(v[0],v[1],v[2],v[3],v[4],v[5],v[6],v[7])
2005  // constants scaled things up by 1<<12, plus we had 1<<2 from first
2006  // loop, plus horizontal and vertical each scale by sqrt(8) so together
2007  // we've got an extra 1<<3, so 1<<17 total we need to remove.
2008  // so we want to round that, which means adding 0.5 * 1<<17,
2009  // aka 65536. Also, we'll end up with -128 to 127 that we want
2010  // to encode as 0..255 by adding 128, so we'll add that before the shift
2011  x0 += 65536 + (128<<17);
2012  x1 += 65536 + (128<<17);
2013  x2 += 65536 + (128<<17);
2014  x3 += 65536 + (128<<17);
2015  // tried computing the shifts into temps, or'ing the temps to see
2016  // if any were out of range, but that was slower
2017  o[0] = stbi__clamp((x0+t3) >> 17);
2018  o[7] = stbi__clamp((x0-t3) >> 17);
2019  o[1] = stbi__clamp((x1+t2) >> 17);
2020  o[6] = stbi__clamp((x1-t2) >> 17);
2021  o[2] = stbi__clamp((x2+t1) >> 17);
2022  o[5] = stbi__clamp((x2-t1) >> 17);
2023  o[3] = stbi__clamp((x3+t0) >> 17);
2024  o[4] = stbi__clamp((x3-t0) >> 17);
2025  }
2026 }
2027 
2028 #ifdef STBI_SSE2
2029 // sse2 integer IDCT. not the fastest possible implementation but it
2030 // produces bit-identical results to the generic C version so it's
2031 // fully "transparent".
2032 static void stbi__idct_simd(stbi_uc *out, int out_stride, short data[64])
2033 {
2034  // This is constructed to match our regular (generic) integer IDCT exactly.
2035  __m128i row0, row1, row2, row3, row4, row5, row6, row7;
2036  __m128i tmp;
2037 
2038  // dot product constant: even elems=x, odd elems=y
2039  #define dct_const(x,y) _mm_setr_epi16((x),(y),(x),(y),(x),(y),(x),(y))
2040 
2041  // out(0) = c0[even]*x + c0[odd]*y (c0, x, y 16-bit, out 32-bit)
2042  // out(1) = c1[even]*x + c1[odd]*y
2043  #define dct_rot(out0,out1, x,y,c0,c1) \
2044  __m128i c0##lo = _mm_unpacklo_epi16((x),(y)); \
2045  __m128i c0##hi = _mm_unpackhi_epi16((x),(y)); \
2046  __m128i out0##_l = _mm_madd_epi16(c0##lo, c0); \
2047  __m128i out0##_h = _mm_madd_epi16(c0##hi, c0); \
2048  __m128i out1##_l = _mm_madd_epi16(c0##lo, c1); \
2049  __m128i out1##_h = _mm_madd_epi16(c0##hi, c1)
2050 
2051  // out = in << 12 (in 16-bit, out 32-bit)
2052  #define dct_widen(out, in) \
2053  __m128i out##_l = _mm_srai_epi32(_mm_unpacklo_epi16(_mm_setzero_si128(), (in)), 4); \
2054  __m128i out##_h = _mm_srai_epi32(_mm_unpackhi_epi16(_mm_setzero_si128(), (in)), 4)
2055 
2056  // wide add
2057  #define dct_wadd(out, a, b) \
2058  __m128i out##_l = _mm_add_epi32(a##_l, b##_l); \
2059  __m128i out##_h = _mm_add_epi32(a##_h, b##_h)
2060 
2061  // wide sub
2062  #define dct_wsub(out, a, b) \
2063  __m128i out##_l = _mm_sub_epi32(a##_l, b##_l); \
2064  __m128i out##_h = _mm_sub_epi32(a##_h, b##_h)
2065 
2066  // butterfly a/b, add bias, then shift by "s" and pack
2067  #define dct_bfly32o(out0, out1, a,b,bias,s) \
2068  { \
2069  __m128i abiased_l = _mm_add_epi32(a##_l, bias); \
2070  __m128i abiased_h = _mm_add_epi32(a##_h, bias); \
2071  dct_wadd(sum, abiased, b); \
2072  dct_wsub(dif, abiased, b); \
2073  out0 = _mm_packs_epi32(_mm_srai_epi32(sum_l, s), _mm_srai_epi32(sum_h, s)); \
2074  out1 = _mm_packs_epi32(_mm_srai_epi32(dif_l, s), _mm_srai_epi32(dif_h, s)); \
2075  }
2076 
2077  // 8-bit interleave step (for transposes)
2078  #define dct_interleave8(a, b) \
2079  tmp = a; \
2080  a = _mm_unpacklo_epi8(a, b); \
2081  b = _mm_unpackhi_epi8(tmp, b)
2082 
2083  // 16-bit interleave step (for transposes)
2084  #define dct_interleave16(a, b) \
2085  tmp = a; \
2086  a = _mm_unpacklo_epi16(a, b); \
2087  b = _mm_unpackhi_epi16(tmp, b)
2088 
2089  #define dct_pass(bias,shift) \
2090  { \
2091  /* even part */ \
2092  dct_rot(t2e,t3e, row2,row6, rot0_0,rot0_1); \
2093  __m128i sum04 = _mm_add_epi16(row0, row4); \
2094  __m128i dif04 = _mm_sub_epi16(row0, row4); \
2095  dct_widen(t0e, sum04); \
2096  dct_widen(t1e, dif04); \
2097  dct_wadd(x0, t0e, t3e); \
2098  dct_wsub(x3, t0e, t3e); \
2099  dct_wadd(x1, t1e, t2e); \
2100  dct_wsub(x2, t1e, t2e); \
2101  /* odd part */ \
2102  dct_rot(y0o,y2o, row7,row3, rot2_0,rot2_1); \
2103  dct_rot(y1o,y3o, row5,row1, rot3_0,rot3_1); \
2104  __m128i sum17 = _mm_add_epi16(row1, row7); \
2105  __m128i sum35 = _mm_add_epi16(row3, row5); \
2106  dct_rot(y4o,y5o, sum17,sum35, rot1_0,rot1_1); \
2107  dct_wadd(x4, y0o, y4o); \
2108  dct_wadd(x5, y1o, y5o); \
2109  dct_wadd(x6, y2o, y5o); \
2110  dct_wadd(x7, y3o, y4o); \
2111  dct_bfly32o(row0,row7, x0,x7,bias,shift); \
2112  dct_bfly32o(row1,row6, x1,x6,bias,shift); \
2113  dct_bfly32o(row2,row5, x2,x5,bias,shift); \
2114  dct_bfly32o(row3,row4, x3,x4,bias,shift); \
2115  }
2116 
2117  __m128i rot0_0 = dct_const(stbi__f2f(0.5411961f), stbi__f2f(0.5411961f) + stbi__f2f(-1.847759065f));
2118  __m128i rot0_1 = dct_const(stbi__f2f(0.5411961f) + stbi__f2f( 0.765366865f), stbi__f2f(0.5411961f));
2119  __m128i rot1_0 = dct_const(stbi__f2f(1.175875602f) + stbi__f2f(-0.899976223f), stbi__f2f(1.175875602f));
2120  __m128i rot1_1 = dct_const(stbi__f2f(1.175875602f), stbi__f2f(1.175875602f) + stbi__f2f(-2.562915447f));
2121  __m128i rot2_0 = dct_const(stbi__f2f(-1.961570560f) + stbi__f2f( 0.298631336f), stbi__f2f(-1.961570560f));
2122  __m128i rot2_1 = dct_const(stbi__f2f(-1.961570560f), stbi__f2f(-1.961570560f) + stbi__f2f( 3.072711026f));
2123  __m128i rot3_0 = dct_const(stbi__f2f(-0.390180644f) + stbi__f2f( 2.053119869f), stbi__f2f(-0.390180644f));
2124  __m128i rot3_1 = dct_const(stbi__f2f(-0.390180644f), stbi__f2f(-0.390180644f) + stbi__f2f( 1.501321110f));
2125 
2126  // rounding biases in column/row passes, see stbi__idct_block for explanation.
2127  __m128i bias_0 = _mm_set1_epi32(512);
2128  __m128i bias_1 = _mm_set1_epi32(65536 + (128<<17));
2129 
2130  // load
2131  row0 = _mm_load_si128((const __m128i *) (data + 0*8));
2132  row1 = _mm_load_si128((const __m128i *) (data + 1*8));
2133  row2 = _mm_load_si128((const __m128i *) (data + 2*8));
2134  row3 = _mm_load_si128((const __m128i *) (data + 3*8));
2135  row4 = _mm_load_si128((const __m128i *) (data + 4*8));
2136  row5 = _mm_load_si128((const __m128i *) (data + 5*8));
2137  row6 = _mm_load_si128((const __m128i *) (data + 6*8));
2138  row7 = _mm_load_si128((const __m128i *) (data + 7*8));
2139 
2140  // column pass
2141  dct_pass(bias_0, 10);
2142 
2143  {
2144  // 16bit 8x8 transpose pass 1
2145  dct_interleave16(row0, row4);
2146  dct_interleave16(row1, row5);
2147  dct_interleave16(row2, row6);
2148  dct_interleave16(row3, row7);
2149 
2150  // transpose pass 2
2151  dct_interleave16(row0, row2);
2152  dct_interleave16(row1, row3);
2153  dct_interleave16(row4, row6);
2154  dct_interleave16(row5, row7);
2155 
2156  // transpose pass 3
2157  dct_interleave16(row0, row1);
2158  dct_interleave16(row2, row3);
2159  dct_interleave16(row4, row5);
2160  dct_interleave16(row6, row7);
2161  }
2162 
2163  // row pass
2164  dct_pass(bias_1, 17);
2165 
2166  {
2167  // pack
2168  __m128i p0 = _mm_packus_epi16(row0, row1); // a0a1a2a3...a7b0b1b2b3...b7
2169  __m128i p1 = _mm_packus_epi16(row2, row3);
2170  __m128i p2 = _mm_packus_epi16(row4, row5);
2171  __m128i p3 = _mm_packus_epi16(row6, row7);
2172 
2173  // 8bit 8x8 transpose pass 1
2174  dct_interleave8(p0, p2); // a0e0a1e1...
2175  dct_interleave8(p1, p3); // c0g0c1g1...
2176 
2177  // transpose pass 2
2178  dct_interleave8(p0, p1); // a0c0e0g0...
2179  dct_interleave8(p2, p3); // b0d0f0h0...
2180 
2181  // transpose pass 3
2182  dct_interleave8(p0, p2); // a0b0c0d0...
2183  dct_interleave8(p1, p3); // a4b4c4d4...
2184 
2185  // store
2186  _mm_storel_epi64((__m128i *) out, p0); out += out_stride;
2187  _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p0, 0x4e)); out += out_stride;
2188  _mm_storel_epi64((__m128i *) out, p2); out += out_stride;
2189  _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p2, 0x4e)); out += out_stride;
2190  _mm_storel_epi64((__m128i *) out, p1); out += out_stride;
2191  _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p1, 0x4e)); out += out_stride;
2192  _mm_storel_epi64((__m128i *) out, p3); out += out_stride;
2193  _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p3, 0x4e));
2194  }
2195 
2196 #undef dct_const
2197 #undef dct_rot
2198 #undef dct_widen
2199 #undef dct_wadd
2200 #undef dct_wsub
2201 #undef dct_bfly32o
2202 #undef dct_interleave8
2203 #undef dct_interleave16
2204 #undef dct_pass
2205 }
2206 
2207 #endif // STBI_SSE2
2208 
2209 #ifdef STBI_NEON
2210 
2211 // NEON integer IDCT. should produce bit-identical
2212 // results to the generic C version.
2213 static void stbi__idct_simd(stbi_uc *out, int out_stride, short data[64])
2214 {
2215  int16x8_t row0, row1, row2, row3, row4, row5, row6, row7;
2216 
2217  int16x4_t rot0_0 = vdup_n_s16(stbi__f2f(0.5411961f));
2218  int16x4_t rot0_1 = vdup_n_s16(stbi__f2f(-1.847759065f));
2219  int16x4_t rot0_2 = vdup_n_s16(stbi__f2f( 0.765366865f));
2220  int16x4_t rot1_0 = vdup_n_s16(stbi__f2f( 1.175875602f));
2221  int16x4_t rot1_1 = vdup_n_s16(stbi__f2f(-0.899976223f));
2222  int16x4_t rot1_2 = vdup_n_s16(stbi__f2f(-2.562915447f));
2223  int16x4_t rot2_0 = vdup_n_s16(stbi__f2f(-1.961570560f));
2224  int16x4_t rot2_1 = vdup_n_s16(stbi__f2f(-0.390180644f));
2225  int16x4_t rot3_0 = vdup_n_s16(stbi__f2f( 0.298631336f));
2226  int16x4_t rot3_1 = vdup_n_s16(stbi__f2f( 2.053119869f));
2227  int16x4_t rot3_2 = vdup_n_s16(stbi__f2f( 3.072711026f));
2228  int16x4_t rot3_3 = vdup_n_s16(stbi__f2f( 1.501321110f));
2229 
2230 #define dct_long_mul(out, inq, coeff) \
2231  int32x4_t out##_l = vmull_s16(vget_low_s16(inq), coeff); \
2232  int32x4_t out##_h = vmull_s16(vget_high_s16(inq), coeff)
2233 
2234 #define dct_long_mac(out, acc, inq, coeff) \
2235  int32x4_t out##_l = vmlal_s16(acc##_l, vget_low_s16(inq), coeff); \
2236  int32x4_t out##_h = vmlal_s16(acc##_h, vget_high_s16(inq), coeff)
2237 
2238 #define dct_widen(out, inq) \
2239  int32x4_t out##_l = vshll_n_s16(vget_low_s16(inq), 12); \
2240  int32x4_t out##_h = vshll_n_s16(vget_high_s16(inq), 12)
2241 
2242 // wide add
2243 #define dct_wadd(out, a, b) \
2244  int32x4_t out##_l = vaddq_s32(a##_l, b##_l); \
2245  int32x4_t out##_h = vaddq_s32(a##_h, b##_h)
2246 
2247 // wide sub
2248 #define dct_wsub(out, a, b) \
2249  int32x4_t out##_l = vsubq_s32(a##_l, b##_l); \
2250  int32x4_t out##_h = vsubq_s32(a##_h, b##_h)
2251 
2252 // butterfly a/b, then shift using "shiftop" by "s" and pack
2253 #define dct_bfly32o(out0,out1, a,b,shiftop,s) \
2254  { \
2255  dct_wadd(sum, a, b); \
2256  dct_wsub(dif, a, b); \
2257  out0 = vcombine_s16(shiftop(sum_l, s), shiftop(sum_h, s)); \
2258  out1 = vcombine_s16(shiftop(dif_l, s), shiftop(dif_h, s)); \
2259  }
2260 
2261 #define dct_pass(shiftop, shift) \
2262  { \
2263  /* even part */ \
2264  int16x8_t sum26 = vaddq_s16(row2, row6); \
2265  dct_long_mul(p1e, sum26, rot0_0); \
2266  dct_long_mac(t2e, p1e, row6, rot0_1); \
2267  dct_long_mac(t3e, p1e, row2, rot0_2); \
2268  int16x8_t sum04 = vaddq_s16(row0, row4); \
2269  int16x8_t dif04 = vsubq_s16(row0, row4); \
2270  dct_widen(t0e, sum04); \
2271  dct_widen(t1e, dif04); \
2272  dct_wadd(x0, t0e, t3e); \
2273  dct_wsub(x3, t0e, t3e); \
2274  dct_wadd(x1, t1e, t2e); \
2275  dct_wsub(x2, t1e, t2e); \
2276  /* odd part */ \
2277  int16x8_t sum15 = vaddq_s16(row1, row5); \
2278  int16x8_t sum17 = vaddq_s16(row1, row7); \
2279  int16x8_t sum35 = vaddq_s16(row3, row5); \
2280  int16x8_t sum37 = vaddq_s16(row3, row7); \
2281  int16x8_t sumodd = vaddq_s16(sum17, sum35); \
2282  dct_long_mul(p5o, sumodd, rot1_0); \
2283  dct_long_mac(p1o, p5o, sum17, rot1_1); \
2284  dct_long_mac(p2o, p5o, sum35, rot1_2); \
2285  dct_long_mul(p3o, sum37, rot2_0); \
2286  dct_long_mul(p4o, sum15, rot2_1); \
2287  dct_wadd(sump13o, p1o, p3o); \
2288  dct_wadd(sump24o, p2o, p4o); \
2289  dct_wadd(sump23o, p2o, p3o); \
2290  dct_wadd(sump14o, p1o, p4o); \
2291  dct_long_mac(x4, sump13o, row7, rot3_0); \
2292  dct_long_mac(x5, sump24o, row5, rot3_1); \
2293  dct_long_mac(x6, sump23o, row3, rot3_2); \
2294  dct_long_mac(x7, sump14o, row1, rot3_3); \
2295  dct_bfly32o(row0,row7, x0,x7,shiftop,shift); \
2296  dct_bfly32o(row1,row6, x1,x6,shiftop,shift); \
2297  dct_bfly32o(row2,row5, x2,x5,shiftop,shift); \
2298  dct_bfly32o(row3,row4, x3,x4,shiftop,shift); \
2299  }
2300 
2301  // load
2302  row0 = vld1q_s16(data + 0*8);
2303  row1 = vld1q_s16(data + 1*8);
2304  row2 = vld1q_s16(data + 2*8);
2305  row3 = vld1q_s16(data + 3*8);
2306  row4 = vld1q_s16(data + 4*8);
2307  row5 = vld1q_s16(data + 5*8);
2308  row6 = vld1q_s16(data + 6*8);
2309  row7 = vld1q_s16(data + 7*8);
2310 
2311  // add DC bias
2312  row0 = vaddq_s16(row0, vsetq_lane_s16(1024, vdupq_n_s16(0), 0));
2313 
2314  // column pass
2315  dct_pass(vrshrn_n_s32, 10);
2316 
2317  // 16bit 8x8 transpose
2318  {
2319 // these three map to a single VTRN.16, VTRN.32, and VSWP, respectively.
2320 // whether compilers actually get this is another story, sadly.
2321 #define dct_trn16(x, y) { int16x8x2_t t = vtrnq_s16(x, y); x = t.val[0]; y = t.val[1]; }
2322 #define dct_trn32(x, y) { int32x4x2_t t = vtrnq_s32(vreinterpretq_s32_s16(x), vreinterpretq_s32_s16(y)); x = vreinterpretq_s16_s32(t.val[0]); y = vreinterpretq_s16_s32(t.val[1]); }
2323 #define dct_trn64(x, y) { int16x8_t x0 = x; int16x8_t y0 = y; x = vcombine_s16(vget_low_s16(x0), vget_low_s16(y0)); y = vcombine_s16(vget_high_s16(x0), vget_high_s16(y0)); }
2324 
2325  // pass 1
2326  dct_trn16(row0, row1); // a0b0a2b2a4b4a6b6
2327  dct_trn16(row2, row3);
2328  dct_trn16(row4, row5);
2329  dct_trn16(row6, row7);
2330 
2331  // pass 2
2332  dct_trn32(row0, row2); // a0b0c0d0a4b4c4d4
2333  dct_trn32(row1, row3);
2334  dct_trn32(row4, row6);
2335  dct_trn32(row5, row7);
2336 
2337  // pass 3
2338  dct_trn64(row0, row4); // a0b0c0d0e0f0g0h0
2339  dct_trn64(row1, row5);
2340  dct_trn64(row2, row6);
2341  dct_trn64(row3, row7);
2342 
2343 #undef dct_trn16
2344 #undef dct_trn32
2345 #undef dct_trn64
2346  }
2347 
2348  // row pass
2349  // vrshrn_n_s32 only supports shifts up to 16, we need
2350  // 17. so do a non-rounding shift of 16 first then follow
2351  // up with a rounding shift by 1.
2352  dct_pass(vshrn_n_s32, 16);
2353 
2354  {
2355  // pack and round
2356  uint8x8_t p0 = vqrshrun_n_s16(row0, 1);
2357  uint8x8_t p1 = vqrshrun_n_s16(row1, 1);
2358  uint8x8_t p2 = vqrshrun_n_s16(row2, 1);
2359  uint8x8_t p3 = vqrshrun_n_s16(row3, 1);
2360  uint8x8_t p4 = vqrshrun_n_s16(row4, 1);
2361  uint8x8_t p5 = vqrshrun_n_s16(row5, 1);
2362  uint8x8_t p6 = vqrshrun_n_s16(row6, 1);
2363  uint8x8_t p7 = vqrshrun_n_s16(row7, 1);
2364 
2365  // again, these can translate into one instruction, but often don't.
2366 #define dct_trn8_8(x, y) { uint8x8x2_t t = vtrn_u8(x, y); x = t.val[0]; y = t.val[1]; }
2367 #define dct_trn8_16(x, y) { uint16x4x2_t t = vtrn_u16(vreinterpret_u16_u8(x), vreinterpret_u16_u8(y)); x = vreinterpret_u8_u16(t.val[0]); y = vreinterpret_u8_u16(t.val[1]); }
2368 #define dct_trn8_32(x, y) { uint32x2x2_t t = vtrn_u32(vreinterpret_u32_u8(x), vreinterpret_u32_u8(y)); x = vreinterpret_u8_u32(t.val[0]); y = vreinterpret_u8_u32(t.val[1]); }
2369 
2370  // sadly can't use interleaved stores here since we only write
2371  // 8 bytes to each scan line!
2372 
2373  // 8x8 8-bit transpose pass 1
2374  dct_trn8_8(p0, p1);
2375  dct_trn8_8(p2, p3);
2376  dct_trn8_8(p4, p5);
2377  dct_trn8_8(p6, p7);
2378 
2379  // pass 2
2380  dct_trn8_16(p0, p2);
2381  dct_trn8_16(p1, p3);
2382  dct_trn8_16(p4, p6);
2383  dct_trn8_16(p5, p7);
2384 
2385  // pass 3
2386  dct_trn8_32(p0, p4);
2387  dct_trn8_32(p1, p5);
2388  dct_trn8_32(p2, p6);
2389  dct_trn8_32(p3, p7);
2390 
2391  // store
2392  vst1_u8(out, p0); out += out_stride;
2393  vst1_u8(out, p1); out += out_stride;
2394  vst1_u8(out, p2); out += out_stride;
2395  vst1_u8(out, p3); out += out_stride;
2396  vst1_u8(out, p4); out += out_stride;
2397  vst1_u8(out, p5); out += out_stride;
2398  vst1_u8(out, p6); out += out_stride;
2399  vst1_u8(out, p7);
2400 
2401 #undef dct_trn8_8
2402 #undef dct_trn8_16
2403 #undef dct_trn8_32
2404  }
2405 
2406 #undef dct_long_mul
2407 #undef dct_long_mac
2408 #undef dct_widen
2409 #undef dct_wadd
2410 #undef dct_wsub
2411 #undef dct_bfly32o
2412 #undef dct_pass
2413 }
2414 
2415 #endif // STBI_NEON
2416 
2417 #define STBI__MARKER_none 0xff
2418 // if there's a pending marker from the entropy stream, return that
2419 // otherwise, fetch from the stream and get a marker. if there's no
2420 // marker, return 0xff, which is never a valid marker value
2421 static stbi_uc stbi__get_marker(stbi__jpeg *j)
2422 {
2423  stbi_uc x;
2424  if (j->marker != STBI__MARKER_none) { x = j->marker; j->marker = STBI__MARKER_none; return x; }
2425  x = stbi__get8(j->s);
2426  if (x != 0xff) return STBI__MARKER_none;
2427  while (x == 0xff)
2428  x = stbi__get8(j->s);
2429  return x;
2430 }
2431 
2432 // in each scan, we'll have scan_n components, and the order
2433 // of the components is specified by order[]
2434 #define STBI__RESTART(x) ((x) >= 0xd0 && (x) <= 0xd7)
2435 
2436 // after a restart interval, stbi__jpeg_reset the entropy decoder and
2437 // the dc prediction
2438 static void stbi__jpeg_reset(stbi__jpeg *j)
2439 {
2440  j->code_bits = 0;
2441  j->code_buffer = 0;
2442  j->nomore = 0;
2443  j->img_comp[0].dc_pred = j->img_comp[1].dc_pred = j->img_comp[2].dc_pred = 0;
2444  j->marker = STBI__MARKER_none;
2445  j->todo = j->restart_interval ? j->restart_interval : 0x7fffffff;
2446  j->eob_run = 0;
2447  // no more than 1<<31 MCUs if no restart_interal? that's plenty safe,
2448  // since we don't even allow 1<<30 pixels
2449 }
2450 
2451 static int stbi__parse_entropy_coded_data(stbi__jpeg *z)
2452 {
2453  stbi__jpeg_reset(z);
2454  if (!z->progressive) {
2455  if (z->scan_n == 1) {
2456  int i,j;
2457  STBI_SIMD_ALIGN(short, data[64]);
2458  int n = z->order[0];
2459  // non-interleaved data, we just need to process one block at a time,
2460  // in trivial scanline order
2461  // number of blocks to do just depends on how many actual "pixels" this
2462  // component has, independent of interleaved MCU blocking and such
2463  int w = (z->img_comp[n].x+7) >> 3;
2464  int h = (z->img_comp[n].y+7) >> 3;
2465  for (j=0; j < h; ++j) {
2466  for (i=0; i < w; ++i) {
2467  int ha = z->img_comp[n].ha;
2468  if (!stbi__jpeg_decode_block(z, data, z->huff_dc+z->img_comp[n].hd, z->huff_ac+ha, z->fast_ac[ha], n, z->dequant[z->img_comp[n].tq])) return 0;
2469  z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*j*8+i*8, z->img_comp[n].w2, data);
2470  // every data block is an MCU, so countdown the restart interval
2471  if (--z->todo <= 0) {
2472  if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
2473  // if it's NOT a restart, then just bail, so we get corrupt data
2474  // rather than no data
2475  if (!STBI__RESTART(z->marker)) return 1;
2476  stbi__jpeg_reset(z);
2477  }
2478  }
2479  }
2480  return 1;
2481  } else { // interleaved
2482  int i,j,k,x,y;
2483  STBI_SIMD_ALIGN(short, data[64]);
2484  for (j=0; j < z->img_mcu_y; ++j) {
2485  for (i=0; i < z->img_mcu_x; ++i) {
2486  // scan an interleaved mcu... process scan_n components in order
2487  for (k=0; k < z->scan_n; ++k) {
2488  int n = z->order[k];
2489  // scan out an mcu's worth of this component; that's just determined
2490  // by the basic H and V specified for the component
2491  for (y=0; y < z->img_comp[n].v; ++y) {
2492  for (x=0; x < z->img_comp[n].h; ++x) {
2493  int x2 = (i*z->img_comp[n].h + x)*8;
2494  int y2 = (j*z->img_comp[n].v + y)*8;
2495  int ha = z->img_comp[n].ha;
2496  if (!stbi__jpeg_decode_block(z, data, z->huff_dc+z->img_comp[n].hd, z->huff_ac+ha, z->fast_ac[ha], n, z->dequant[z->img_comp[n].tq])) return 0;
2497  z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*y2+x2, z->img_comp[n].w2, data);
2498  }
2499  }
2500  }
2501  // after all interleaved components, that's an interleaved MCU,
2502  // so now count down the restart interval
2503  if (--z->todo <= 0) {
2504  if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
2505  if (!STBI__RESTART(z->marker)) return 1;
2506  stbi__jpeg_reset(z);
2507  }
2508  }
2509  }
2510  return 1;
2511  }
2512  } else {
2513  if (z->scan_n == 1) {
2514  int i,j;
2515  int n = z->order[0];
2516  // non-interleaved data, we just need to process one block at a time,
2517  // in trivial scanline order
2518  // number of blocks to do just depends on how many actual "pixels" this
2519  // component has, independent of interleaved MCU blocking and such
2520  int w = (z->img_comp[n].x+7) >> 3;
2521  int h = (z->img_comp[n].y+7) >> 3;
2522  for (j=0; j < h; ++j) {
2523  for (i=0; i < w; ++i) {
2524  short *data = z->img_comp[n].coeff + 64 * (i + j * z->img_comp[n].coeff_w);
2525  if (z->spec_start == 0) {
2526  if (!stbi__jpeg_decode_block_prog_dc(z, data, &z->huff_dc[z->img_comp[n].hd], n))
2527  return 0;
2528  } else {
2529  int ha = z->img_comp[n].ha;
2530  if (!stbi__jpeg_decode_block_prog_ac(z, data, &z->huff_ac[ha], z->fast_ac[ha]))
2531  return 0;
2532  }
2533  // every data block is an MCU, so countdown the restart interval
2534  if (--z->todo <= 0) {
2535  if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
2536  if (!STBI__RESTART(z->marker)) return 1;
2537  stbi__jpeg_reset(z);
2538  }
2539  }
2540  }
2541  return 1;
2542  } else { // interleaved
2543  int i,j,k,x,y;
2544  for (j=0; j < z->img_mcu_y; ++j) {
2545  for (i=0; i < z->img_mcu_x; ++i) {
2546  // scan an interleaved mcu... process scan_n components in order
2547  for (k=0; k < z->scan_n; ++k) {
2548  int n = z->order[k];
2549  // scan out an mcu's worth of this component; that's just determined
2550  // by the basic H and V specified for the component
2551  for (y=0; y < z->img_comp[n].v; ++y) {
2552  for (x=0; x < z->img_comp[n].h; ++x) {
2553  int x2 = (i*z->img_comp[n].h + x);
2554  int y2 = (j*z->img_comp[n].v + y);
2555  short *data = z->img_comp[n].coeff + 64 * (x2 + y2 * z->img_comp[n].coeff_w);
2556  if (!stbi__jpeg_decode_block_prog_dc(z, data, &z->huff_dc[z->img_comp[n].hd], n))
2557  return 0;
2558  }
2559  }
2560  }
2561  // after all interleaved components, that's an interleaved MCU,
2562  // so now count down the restart interval
2563  if (--z->todo <= 0) {
2564  if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
2565  if (!STBI__RESTART(z->marker)) return 1;
2566  stbi__jpeg_reset(z);
2567  }
2568  }
2569  }
2570  return 1;
2571  }
2572  }
2573 }
2574 
2575 static void stbi__jpeg_dequantize(short *data, stbi_uc *dequant)
2576 {
2577  int i;
2578  for (i=0; i < 64; ++i)
2579  data[i] *= dequant[i];
2580 }
2581 
2582 static void stbi__jpeg_finish(stbi__jpeg *z)
2583 {
2584  if (z->progressive) {
2585  // dequantize and idct the data
2586  int i,j,n;
2587  for (n=0; n < z->s->img_n; ++n) {
2588  int w = (z->img_comp[n].x+7) >> 3;
2589  int h = (z->img_comp[n].y+7) >> 3;
2590  for (j=0; j < h; ++j) {
2591  for (i=0; i < w; ++i) {
2592  short *data = z->img_comp[n].coeff + 64 * (i + j * z->img_comp[n].coeff_w);
2593  stbi__jpeg_dequantize(data, z->dequant[z->img_comp[n].tq]);
2594  z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*j*8+i*8, z->img_comp[n].w2, data);
2595  }
2596  }
2597  }
2598  }
2599 }
2600 
2601 static int stbi__process_marker(stbi__jpeg *z, int m)
2602 {
2603  int L;
2604  switch (m) {
2605  case STBI__MARKER_none: // no marker found
2606  return stbi__err("expected marker","Corrupt JPEG");
2607 
2608  case 0xDD: // DRI - specify restart interval
2609  if (stbi__get16be(z->s) != 4) return stbi__err("bad DRI len","Corrupt JPEG");
2610  z->restart_interval = stbi__get16be(z->s);
2611  return 1;
2612 
2613  case 0xDB: // DQT - define quantization table
2614  L = stbi__get16be(z->s)-2;
2615  while (L > 0) {
2616  int q = stbi__get8(z->s);
2617  int p = q >> 4;
2618  int t = q & 15,i;
2619  if (p != 0) return stbi__err("bad DQT type","Corrupt JPEG");
2620  if (t > 3) return stbi__err("bad DQT table","Corrupt JPEG");
2621  for (i=0; i < 64; ++i)
2622  z->dequant[t][stbi__jpeg_dezigzag[i]] = stbi__get8(z->s);
2623  L -= 65;
2624  }
2625  return L==0;
2626 
2627  case 0xC4: // DHT - define huffman table
2628  L = stbi__get16be(z->s)-2;
2629  while (L > 0) {
2630  stbi_uc *v;
2631  int sizes[16],i,n=0;
2632  int q = stbi__get8(z->s);
2633  int tc = q >> 4;
2634  int th = q & 15;
2635  if (tc > 1 || th > 3) return stbi__err("bad DHT header","Corrupt JPEG");
2636  for (i=0; i < 16; ++i) {
2637  sizes[i] = stbi__get8(z->s);
2638  n += sizes[i];
2639  }
2640  L -= 17;
2641  if (tc == 0) {
2642  if (!stbi__build_huffman(z->huff_dc+th, sizes)) return 0;
2643  v = z->huff_dc[th].values;
2644  } else {
2645  if (!stbi__build_huffman(z->huff_ac+th, sizes)) return 0;
2646  v = z->huff_ac[th].values;
2647  }
2648  for (i=0; i < n; ++i)
2649  v[i] = stbi__get8(z->s);
2650  if (tc != 0)
2651  stbi__build_fast_ac(z->fast_ac[th], z->huff_ac + th);
2652  L -= n;
2653  }
2654  return L==0;
2655  }
2656  // check for comment block or APP blocks
2657  if ((m >= 0xE0 && m <= 0xEF) || m == 0xFE) {
2658  stbi__skip(z->s, stbi__get16be(z->s)-2);
2659  return 1;
2660  }
2661  return 0;
2662 }
2663 
2664 // after we see SOS
2665 static int stbi__process_scan_header(stbi__jpeg *z)
2666 {
2667  int i;
2668  int Ls = stbi__get16be(z->s);
2669  z->scan_n = stbi__get8(z->s);
2670  if (z->scan_n < 1 || z->scan_n > 4 || z->scan_n > (int) z->s->img_n) return stbi__err("bad SOS component count","Corrupt JPEG");
2671  if (Ls != 6+2*z->scan_n) return stbi__err("bad SOS len","Corrupt JPEG");
2672  for (i=0; i < z->scan_n; ++i) {
2673  int id = stbi__get8(z->s), which;
2674  int q = stbi__get8(z->s);
2675  for (which = 0; which < z->s->img_n; ++which)
2676  if (z->img_comp[which].id == id)
2677  break;
2678  if (which == z->s->img_n) return 0; // no match
2679  z->img_comp[which].hd = q >> 4; if (z->img_comp[which].hd > 3) return stbi__err("bad DC huff","Corrupt JPEG");
2680  z->img_comp[which].ha = q & 15; if (z->img_comp[which].ha > 3) return stbi__err("bad AC huff","Corrupt JPEG");
2681  z->order[i] = which;
2682  }
2683 
2684  {
2685  int aa;
2686  z->spec_start = stbi__get8(z->s);
2687  z->spec_end = stbi__get8(z->s); // should be 63, but might be 0
2688  aa = stbi__get8(z->s);
2689  z->succ_high = (aa >> 4);
2690  z->succ_low = (aa & 15);
2691  if (z->progressive) {
2692  if (z->spec_start > 63 || z->spec_end > 63 || z->spec_start > z->spec_end || z->succ_high > 13 || z->succ_low > 13)
2693  return stbi__err("bad SOS", "Corrupt JPEG");
2694  } else {
2695  if (z->spec_start != 0) return stbi__err("bad SOS","Corrupt JPEG");
2696  if (z->succ_high != 0 || z->succ_low != 0) return stbi__err("bad SOS","Corrupt JPEG");
2697  z->spec_end = 63;
2698  }
2699  }
2700 
2701  return 1;
2702 }
2703 
2704 static int stbi__process_frame_header(stbi__jpeg *z, int scan)
2705 {
2706  stbi__context *s = z->s;
2707  int Lf,p,i,q, h_max=1,v_max=1,c;
2708  Lf = stbi__get16be(s); if (Lf < 11) return stbi__err("bad SOF len","Corrupt JPEG"); // JPEG
2709  p = stbi__get8(s); if (p != 8) return stbi__err("only 8-bit","JPEG format not supported: 8-bit only"); // JPEG baseline
2710  s->img_y = stbi__get16be(s); if (s->img_y == 0) return stbi__err("no header height", "JPEG format not supported: delayed height"); // Legal, but we don't handle it--but neither does IJG
2711  s->img_x = stbi__get16be(s); if (s->img_x == 0) return stbi__err("0 width","Corrupt JPEG"); // JPEG requires
2712  c = stbi__get8(s);
2713  if (c != 3 && c != 1) return stbi__err("bad component count","Corrupt JPEG"); // JFIF requires
2714  s->img_n = c;
2715  for (i=0; i < c; ++i) {
2716  z->img_comp[i].data = NULL;
2717  z->img_comp[i].linebuf = NULL;
2718  }
2719 
2720  if (Lf != 8+3*s->img_n) return stbi__err("bad SOF len","Corrupt JPEG");
2721 
2722  z->rgb = 0;
2723  for (i=0; i < s->img_n; ++i) {
2724  static unsigned char rgb[3] = { 'R', 'G', 'B' };
2725  z->img_comp[i].id = stbi__get8(s);
2726  if (z->img_comp[i].id != i+1) // JFIF requires
2727  if (z->img_comp[i].id != i) { // some version of jpegtran outputs non-JFIF-compliant files!
2728  // somethings output this (see http://fileformats.archiveteam.org/wiki/JPEG#Color_format)
2729  if (z->img_comp[i].id != rgb[i])
2730  return stbi__err("bad component ID","Corrupt JPEG");
2731  ++z->rgb;
2732  }
2733  q = stbi__get8(s);
2734  z->img_comp[i].h = (q >> 4); if (!z->img_comp[i].h || z->img_comp[i].h > 4) return stbi__err("bad H","Corrupt JPEG");
2735  z->img_comp[i].v = q & 15; if (!z->img_comp[i].v || z->img_comp[i].v > 4) return stbi__err("bad V","Corrupt JPEG");
2736  z->img_comp[i].tq = stbi__get8(s); if (z->img_comp[i].tq > 3) return stbi__err("bad TQ","Corrupt JPEG");
2737  }
2738 
2739  if (scan != STBI__SCAN_load) return 1;
2740 
2741  if ((1 << 30) / s->img_x / s->img_n < s->img_y) return stbi__err("too large", "Image too large to decode");
2742 
2743  for (i=0; i < s->img_n; ++i) {
2744  if (z->img_comp[i].h > h_max) h_max = z->img_comp[i].h;
2745  if (z->img_comp[i].v > v_max) v_max = z->img_comp[i].v;
2746  }
2747 
2748  // compute interleaved mcu info
2749  z->img_h_max = h_max;
2750  z->img_v_max = v_max;
2751  z->img_mcu_w = h_max * 8;
2752  z->img_mcu_h = v_max * 8;
2753  z->img_mcu_x = (s->img_x + z->img_mcu_w-1) / z->img_mcu_w;
2754  z->img_mcu_y = (s->img_y + z->img_mcu_h-1) / z->img_mcu_h;
2755 
2756  for (i=0; i < s->img_n; ++i) {
2757  // number of effective pixels (e.g. for non-interleaved MCU)
2758  z->img_comp[i].x = (s->img_x * z->img_comp[i].h + h_max-1) / h_max;
2759  z->img_comp[i].y = (s->img_y * z->img_comp[i].v + v_max-1) / v_max;
2760  // to simplify generation, we'll allocate enough memory to decode
2761  // the bogus oversized data from using interleaved MCUs and their
2762  // big blocks (e.g. a 16x16 iMCU on an image of width 33); we won't
2763  // discard the extra data until colorspace conversion
2764  z->img_comp[i].w2 = z->img_mcu_x * z->img_comp[i].h * 8;
2765  z->img_comp[i].h2 = z->img_mcu_y * z->img_comp[i].v * 8;
2766  z->img_comp[i].raw_data = stbi__malloc(z->img_comp[i].w2 * z->img_comp[i].h2+15);
2767 
2768  if (z->img_comp[i].raw_data == NULL) {
2769  for(--i; i >= 0; --i) {
2770  STBI_FREE(z->img_comp[i].raw_data);
2771  z->img_comp[i].raw_data = NULL;
2772  }
2773  return stbi__err("outofmem", "Out of memory");
2774  }
2775  // align blocks for idct using mmx/sse
2776  z->img_comp[i].data = (stbi_uc*) (((size_t) z->img_comp[i].raw_data + 15) & ~15);
2777  z->img_comp[i].linebuf = NULL;
2778  if (z->progressive) {
2779  z->img_comp[i].coeff_w = (z->img_comp[i].w2 + 7) >> 3;
2780  z->img_comp[i].coeff_h = (z->img_comp[i].h2 + 7) >> 3;
2781  z->img_comp[i].raw_coeff = STBI_MALLOC(z->img_comp[i].coeff_w * z->img_comp[i].coeff_h * 64 * sizeof(short) + 15);
2782  z->img_comp[i].coeff = (short*) (((size_t) z->img_comp[i].raw_coeff + 15) & ~15);
2783  } else {
2784  z->img_comp[i].coeff = 0;
2785  z->img_comp[i].raw_coeff = 0;
2786  }
2787  }
2788 
2789  return 1;
2790 }
2791 
2792 // use comparisons since in some cases we handle more than one case (e.g. SOF)
2793 #define stbi__DNL(x) ((x) == 0xdc)
2794 #define stbi__SOI(x) ((x) == 0xd8)
2795 #define stbi__EOI(x) ((x) == 0xd9)
2796 #define stbi__SOF(x) ((x) == 0xc0 || (x) == 0xc1 || (x) == 0xc2)
2797 #define stbi__SOS(x) ((x) == 0xda)
2798 
2799 #define stbi__SOF_progressive(x) ((x) == 0xc2)
2800 
2801 static int stbi__decode_jpeg_header(stbi__jpeg *z, int scan)
2802 {
2803  int m;
2804  z->marker = STBI__MARKER_none; // initialize cached marker to empty
2805  m = stbi__get_marker(z);
2806  if (!stbi__SOI(m)) return stbi__err("no SOI","Corrupt JPEG");
2807  if (scan == STBI__SCAN_type) return 1;
2808  m = stbi__get_marker(z);
2809  while (!stbi__SOF(m)) {
2810  if (!stbi__process_marker(z,m)) return 0;
2811  m = stbi__get_marker(z);
2812  while (m == STBI__MARKER_none) {
2813  // some files have extra padding after their blocks, so ok, we'll scan
2814  if (stbi__at_eof(z->s)) return stbi__err("no SOF", "Corrupt JPEG");
2815  m = stbi__get_marker(z);
2816  }
2817  }
2818  z->progressive = stbi__SOF_progressive(m);
2819  if (!stbi__process_frame_header(z, scan)) return 0;
2820  return 1;
2821 }
2822 
2823 // decode image to YCbCr format
2824 static int stbi__decode_jpeg_image(stbi__jpeg *j)
2825 {
2826  int m;
2827  for (m = 0; m < 4; m++) {
2828  j->img_comp[m].raw_data = NULL;
2829  j->img_comp[m].raw_coeff = NULL;
2830  }
2831  j->restart_interval = 0;
2832  if (!stbi__decode_jpeg_header(j, STBI__SCAN_load)) return 0;
2833  m = stbi__get_marker(j);
2834  while (!stbi__EOI(m)) {
2835  if (stbi__SOS(m)) {
2836  if (!stbi__process_scan_header(j)) return 0;
2837  if (!stbi__parse_entropy_coded_data(j)) return 0;
2838  if (j->marker == STBI__MARKER_none ) {
2839  // handle 0s at the end of image data from IP Kamera 9060
2840  while (!stbi__at_eof(j->s)) {
2841  int x = stbi__get8(j->s);
2842  if (x == 255) {
2843  j->marker = stbi__get8(j->s);
2844  break;
2845  } else if (x != 0) {
2846  return stbi__err("junk before marker", "Corrupt JPEG");
2847  }
2848  }
2849  // if we reach eof without hitting a marker, stbi__get_marker() below will fail and we'll eventually return 0
2850  }
2851  } else {
2852  if (!stbi__process_marker(j, m)) return 0;
2853  }
2854  m = stbi__get_marker(j);
2855  }
2856  if (j->progressive)
2857  stbi__jpeg_finish(j);
2858  return 1;
2859 }
2860 
2861 // static jfif-centered resampling (across block boundaries)
2862 
2863 typedef stbi_uc *(*resample_row_func)(stbi_uc *out, stbi_uc *in0, stbi_uc *in1,
2864  int w, int hs);
2865 
2866 #define stbi__div4(x) ((stbi_uc) ((x) >> 2))
2867 
2868 static stbi_uc *resample_row_1(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
2869 {
2870  STBI_NOTUSED(out);
2871  STBI_NOTUSED(in_far);
2872  STBI_NOTUSED(w);
2873  STBI_NOTUSED(hs);
2874  return in_near;
2875 }
2876 
2877 static stbi_uc* stbi__resample_row_v_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
2878 {
2879  // need to generate two samples vertically for every one in input
2880  int i;
2881  STBI_NOTUSED(hs);
2882  for (i=0; i < w; ++i)
2883  out[i] = stbi__div4(3*in_near[i] + in_far[i] + 2);
2884  return out;
2885 }
2886 
2887 static stbi_uc* stbi__resample_row_h_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
2888 {
2889  // need to generate two samples horizontally for every one in input
2890  int i;
2891  stbi_uc *input = in_near;
2892 
2893  if (w == 1) {
2894  // if only one sample, can't do any interpolation
2895  out[0] = out[1] = input[0];
2896  return out;
2897  }
2898 
2899  out[0] = input[0];
2900  out[1] = stbi__div4(input[0]*3 + input[1] + 2);
2901  for (i=1; i < w-1; ++i) {
2902  int n = 3*input[i]+2;
2903  out[i*2+0] = stbi__div4(n+input[i-1]);
2904  out[i*2+1] = stbi__div4(n+input[i+1]);
2905  }
2906  out[i*2+0] = stbi__div4(input[w-2]*3 + input[w-1] + 2);
2907  out[i*2+1] = input[w-1];
2908 
2909  STBI_NOTUSED(in_far);
2910  STBI_NOTUSED(hs);
2911 
2912  return out;
2913 }
2914 
2915 #define stbi__div16(x) ((stbi_uc) ((x) >> 4))
2916 
2917 static stbi_uc *stbi__resample_row_hv_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
2918 {
2919  // need to generate 2x2 samples for every one in input
2920  int i,t0,t1;
2921  if (w == 1) {
2922  out[0] = out[1] = stbi__div4(3*in_near[0] + in_far[0] + 2);
2923  return out;
2924  }
2925 
2926  t1 = 3*in_near[0] + in_far[0];
2927  out[0] = stbi__div4(t1+2);
2928  for (i=1; i < w; ++i) {
2929  t0 = t1;
2930  t1 = 3*in_near[i]+in_far[i];
2931  out[i*2-1] = stbi__div16(3*t0 + t1 + 8);
2932  out[i*2 ] = stbi__div16(3*t1 + t0 + 8);
2933  }
2934  out[w*2-1] = stbi__div4(t1+2);
2935 
2936  STBI_NOTUSED(hs);
2937 
2938  return out;
2939 }
2940 
2941 #if defined(STBI_SSE2) || defined(STBI_NEON)
2942 static stbi_uc *stbi__resample_row_hv_2_simd(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
2943 {
2944  // need to generate 2x2 samples for every one in input
2945  int i=0,t0,t1;
2946 
2947  if (w == 1) {
2948  out[0] = out[1] = stbi__div4(3*in_near[0] + in_far[0] + 2);
2949  return out;
2950  }
2951 
2952  t1 = 3*in_near[0] + in_far[0];
2953  // process groups of 8 pixels for as long as we can.
2954  // note we can't handle the last pixel in a row in this loop
2955  // because we need to handle the filter boundary conditions.
2956  for (; i < ((w-1) & ~7); i += 8) {
2957 #if defined(STBI_SSE2)
2958  // load and perform the vertical filtering pass
2959  // this uses 3*x + y = 4*x + (y - x)
2960  __m128i zero = _mm_setzero_si128();
2961  __m128i farb = _mm_loadl_epi64((__m128i *) (in_far + i));
2962  __m128i nearb = _mm_loadl_epi64((__m128i *) (in_near + i));
2963  __m128i farw = _mm_unpacklo_epi8(farb, zero);
2964  __m128i nearw = _mm_unpacklo_epi8(nearb, zero);
2965  __m128i diff = _mm_sub_epi16(farw, nearw);
2966  __m128i nears = _mm_slli_epi16(nearw, 2);
2967  __m128i curr = _mm_add_epi16(nears, diff); // current row
2968 
2969  // horizontal filter works the same based on shifted vers of current
2970  // row. "prev" is current row shifted right by 1 pixel; we need to
2971  // insert the previous pixel value (from t1).
2972  // "next" is current row shifted left by 1 pixel, with first pixel
2973  // of next block of 8 pixels added in.
2974  __m128i prv0 = _mm_slli_si128(curr, 2);
2975  __m128i nxt0 = _mm_srli_si128(curr, 2);
2976  __m128i prev = _mm_insert_epi16(prv0, t1, 0);
2977  __m128i next = _mm_insert_epi16(nxt0, 3*in_near[i+8] + in_far[i+8], 7);
2978 
2979  // horizontal filter, polyphase implementation since it's convenient:
2980  // even pixels = 3*cur + prev = cur*4 + (prev - cur)
2981  // odd pixels = 3*cur + next = cur*4 + (next - cur)
2982  // note the shared term.
2983  __m128i bias = _mm_set1_epi16(8);
2984  __m128i curs = _mm_slli_epi16(curr, 2);
2985  __m128i prvd = _mm_sub_epi16(prev, curr);
2986  __m128i nxtd = _mm_sub_epi16(next, curr);
2987  __m128i curb = _mm_add_epi16(curs, bias);
2988  __m128i even = _mm_add_epi16(prvd, curb);
2989  __m128i odd = _mm_add_epi16(nxtd, curb);
2990 
2991  // interleave even and odd pixels, then undo scaling.
2992  __m128i int0 = _mm_unpacklo_epi16(even, odd);
2993  __m128i int1 = _mm_unpackhi_epi16(even, odd);
2994  __m128i de0 = _mm_srli_epi16(int0, 4);
2995  __m128i de1 = _mm_srli_epi16(int1, 4);
2996 
2997  // pack and write output
2998  __m128i outv = _mm_packus_epi16(de0, de1);
2999  _mm_storeu_si128((__m128i *) (out + i*2), outv);
3000 #elif defined(STBI_NEON)
3001  // load and perform the vertical filtering pass
3002  // this uses 3*x + y = 4*x + (y - x)
3003  uint8x8_t farb = vld1_u8(in_far + i);
3004  uint8x8_t nearb = vld1_u8(in_near + i);
3005  int16x8_t diff = vreinterpretq_s16_u16(vsubl_u8(farb, nearb));
3006  int16x8_t nears = vreinterpretq_s16_u16(vshll_n_u8(nearb, 2));
3007  int16x8_t curr = vaddq_s16(nears, diff); // current row
3008 
3009  // horizontal filter works the same based on shifted vers of current
3010  // row. "prev" is current row shifted right by 1 pixel; we need to
3011  // insert the previous pixel value (from t1).
3012  // "next" is current row shifted left by 1 pixel, with first pixel
3013  // of next block of 8 pixels added in.
3014  int16x8_t prv0 = vextq_s16(curr, curr, 7);
3015  int16x8_t nxt0 = vextq_s16(curr, curr, 1);
3016  int16x8_t prev = vsetq_lane_s16(t1, prv0, 0);
3017  int16x8_t next = vsetq_lane_s16(3*in_near[i+8] + in_far[i+8], nxt0, 7);
3018 
3019  // horizontal filter, polyphase implementation since it's convenient:
3020  // even pixels = 3*cur + prev = cur*4 + (prev - cur)
3021  // odd pixels = 3*cur + next = cur*4 + (next - cur)
3022  // note the shared term.
3023  int16x8_t curs = vshlq_n_s16(curr, 2);
3024  int16x8_t prvd = vsubq_s16(prev, curr);
3025  int16x8_t nxtd = vsubq_s16(next, curr);
3026  int16x8_t even = vaddq_s16(curs, prvd);
3027  int16x8_t odd = vaddq_s16(curs, nxtd);
3028 
3029  // undo scaling and round, then store with even/odd phases interleaved
3030  uint8x8x2_t o;
3031  o.val[0] = vqrshrun_n_s16(even, 4);
3032  o.val[1] = vqrshrun_n_s16(odd, 4);
3033  vst2_u8(out + i*2, o);
3034 #endif
3035 
3036  // "previous" value for next iter
3037  t1 = 3*in_near[i+7] + in_far[i+7];
3038  }
3039 
3040  t0 = t1;
3041  t1 = 3*in_near[i] + in_far[i];
3042  out[i*2] = stbi__div16(3*t1 + t0 + 8);
3043 
3044  for (++i; i < w; ++i) {
3045  t0 = t1;
3046  t1 = 3*in_near[i]+in_far[i];
3047  out[i*2-1] = stbi__div16(3*t0 + t1 + 8);
3048  out[i*2 ] = stbi__div16(3*t1 + t0 + 8);
3049  }
3050  out[w*2-1] = stbi__div4(t1+2);
3051 
3052  STBI_NOTUSED(hs);
3053 
3054  return out;
3055 }
3056 #endif
3057 
3058 static stbi_uc *stbi__resample_row_generic(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
3059 {
3060  // resample with nearest-neighbor
3061  int i,j;
3062  STBI_NOTUSED(in_far);
3063  for (i=0; i < w; ++i)
3064  for (j=0; j < hs; ++j)
3065  out[i*hs+j] = in_near[i];
3066  return out;
3067 }
3068 
3069 #ifdef STBI_JPEG_OLD
3070 // this is the same YCbCr-to-RGB calculation that stb_image has used
3071 // historically before the algorithm changes in 1.49
3072 #define float2fixed(x) ((int) ((x) * 65536 + 0.5))
3073 static void stbi__YCbCr_to_RGB_row(stbi_uc *out, const stbi_uc *y, const stbi_uc *pcb, const stbi_uc *pcr, int count, int step)
3074 {
3075  int i;
3076  for (i=0; i < count; ++i) {
3077  int y_fixed = (y[i] << 16) + 32768; // rounding
3078  int r,g,b;
3079  int cr = pcr[i] - 128;
3080  int cb = pcb[i] - 128;
3081  r = y_fixed + cr*float2fixed(1.40200f);
3082  g = y_fixed - cr*float2fixed(0.71414f) - cb*float2fixed(0.34414f);
3083  b = y_fixed + cb*float2fixed(1.77200f);
3084  r >>= 16;
3085  g >>= 16;
3086  b >>= 16;
3087  if ((unsigned) r > 255) { if (r < 0) r = 0; else r = 255; }
3088  if ((unsigned) g > 255) { if (g < 0) g = 0; else g = 255; }
3089  if ((unsigned) b > 255) { if (b < 0) b = 0; else b = 255; }
3090  out[0] = (stbi_uc)r;
3091  out[1] = (stbi_uc)g;
3092  out[2] = (stbi_uc)b;
3093  out[3] = 255;
3094  out += step;
3095  }
3096 }
3097 #else
3098 // this is a reduced-precision calculation of YCbCr-to-RGB introduced
3099 // to make sure the code produces the same results in both SIMD and scalar
3100 #define float2fixed(x) (((int) ((x) * 4096.0f + 0.5f)) << 8)
3101 static void stbi__YCbCr_to_RGB_row(stbi_uc *out, const stbi_uc *y, const stbi_uc *pcb, const stbi_uc *pcr, int count, int step)
3102 {
3103  int i;
3104  for (i=0; i < count; ++i) {
3105  int y_fixed = (y[i] << 20) + (1<<19); // rounding
3106  int r,g,b;
3107  int cr = pcr[i] - 128;
3108  int cb = pcb[i] - 128;
3109  r = y_fixed + cr* float2fixed(1.40200f);
3110  g = y_fixed + (cr*-float2fixed(0.71414f)) + ((cb*-float2fixed(0.34414f)) & 0xffff0000);
3111  b = y_fixed + cb* float2fixed(1.77200f);
3112  r >>= 20;
3113  g >>= 20;
3114  b >>= 20;
3115  if ((unsigned) r > 255) { if (r < 0) r = 0; else r = 255; }
3116  if ((unsigned) g > 255) { if (g < 0) g = 0; else g = 255; }
3117  if ((unsigned) b > 255) { if (b < 0) b = 0; else b = 255; }
3118  out[0] = (stbi_uc)r;
3119  out[1] = (stbi_uc)g;
3120  out[2] = (stbi_uc)b;
3121  out[3] = 255;
3122  out += step;
3123  }
3124 }
3125 #endif
3126 
3127 #if defined(STBI_SSE2) || defined(STBI_NEON)
3128 static void stbi__YCbCr_to_RGB_simd(stbi_uc *out, stbi_uc const *y, stbi_uc const *pcb, stbi_uc const *pcr, int count, int step)
3129 {
3130  int i = 0;
3131 
3132 #ifdef STBI_SSE2
3133  // step == 3 is pretty ugly on the final interleave, and i'm not convinced
3134  // it's useful in practice (you wouldn't use it for textures, for example).
3135  // so just accelerate step == 4 case.
3136  if (step == 4) {
3137  // this is a fairly straightforward implementation and not super-optimized.
3138  __m128i signflip = _mm_set1_epi8(-0x80);
3139  __m128i cr_const0 = _mm_set1_epi16( (short) ( 1.40200f*4096.0f+0.5f));
3140  __m128i cr_const1 = _mm_set1_epi16( - (short) ( 0.71414f*4096.0f+0.5f));
3141  __m128i cb_const0 = _mm_set1_epi16( - (short) ( 0.34414f*4096.0f+0.5f));
3142  __m128i cb_const1 = _mm_set1_epi16( (short) ( 1.77200f*4096.0f+0.5f));
3143  __m128i y_bias = _mm_set1_epi8((char) (unsigned char) 128);
3144  __m128i xw = _mm_set1_epi16(255); // alpha channel
3145 
3146  for (; i+7 < count; i += 8) {
3147  // load
3148  __m128i y_bytes = _mm_loadl_epi64((__m128i *) (y+i));
3149  __m128i cr_bytes = _mm_loadl_epi64((__m128i *) (pcr+i));
3150  __m128i cb_bytes = _mm_loadl_epi64((__m128i *) (pcb+i));
3151  __m128i cr_biased = _mm_xor_si128(cr_bytes, signflip); // -128
3152  __m128i cb_biased = _mm_xor_si128(cb_bytes, signflip); // -128
3153 
3154  // unpack to short (and left-shift cr, cb by 8)
3155  __m128i yw = _mm_unpacklo_epi8(y_bias, y_bytes);
3156  __m128i crw = _mm_unpacklo_epi8(_mm_setzero_si128(), cr_biased);
3157  __m128i cbw = _mm_unpacklo_epi8(_mm_setzero_si128(), cb_biased);
3158 
3159  // color transform
3160  __m128i yws = _mm_srli_epi16(yw, 4);
3161  __m128i cr0 = _mm_mulhi_epi16(cr_const0, crw);
3162  __m128i cb0 = _mm_mulhi_epi16(cb_const0, cbw);
3163  __m128i cb1 = _mm_mulhi_epi16(cbw, cb_const1);
3164  __m128i cr1 = _mm_mulhi_epi16(crw, cr_const1);
3165  __m128i rws = _mm_add_epi16(cr0, yws);
3166  __m128i gwt = _mm_add_epi16(cb0, yws);
3167  __m128i bws = _mm_add_epi16(yws, cb1);
3168  __m128i gws = _mm_add_epi16(gwt, cr1);
3169 
3170  // descale
3171  __m128i rw = _mm_srai_epi16(rws, 4);
3172  __m128i bw = _mm_srai_epi16(bws, 4);
3173  __m128i gw = _mm_srai_epi16(gws, 4);
3174 
3175  // back to byte, set up for transpose
3176  __m128i brb = _mm_packus_epi16(rw, bw);
3177  __m128i gxb = _mm_packus_epi16(gw, xw);
3178 
3179  // transpose to interleave channels
3180  __m128i t0 = _mm_unpacklo_epi8(brb, gxb);
3181  __m128i t1 = _mm_unpackhi_epi8(brb, gxb);
3182  __m128i o0 = _mm_unpacklo_epi16(t0, t1);
3183  __m128i o1 = _mm_unpackhi_epi16(t0, t1);
3184 
3185  // store
3186  _mm_storeu_si128((__m128i *) (out + 0), o0);
3187  _mm_storeu_si128((__m128i *) (out + 16), o1);
3188  out += 32;
3189  }
3190  }
3191 #endif
3192 
3193 #ifdef STBI_NEON
3194  // in this version, step=3 support would be easy to add. but is there demand?
3195  if (step == 4) {
3196  // this is a fairly straightforward implementation and not super-optimized.
3197  uint8x8_t signflip = vdup_n_u8(0x80);
3198  int16x8_t cr_const0 = vdupq_n_s16( (short) ( 1.40200f*4096.0f+0.5f));
3199  int16x8_t cr_const1 = vdupq_n_s16( - (short) ( 0.71414f*4096.0f+0.5f));
3200  int16x8_t cb_const0 = vdupq_n_s16( - (short) ( 0.34414f*4096.0f+0.5f));
3201  int16x8_t cb_const1 = vdupq_n_s16( (short) ( 1.77200f*4096.0f+0.5f));
3202 
3203  for (; i+7 < count; i += 8) {
3204  // load
3205  uint8x8_t y_bytes = vld1_u8(y + i);
3206  uint8x8_t cr_bytes = vld1_u8(pcr + i);
3207  uint8x8_t cb_bytes = vld1_u8(pcb + i);
3208  int8x8_t cr_biased = vreinterpret_s8_u8(vsub_u8(cr_bytes, signflip));
3209  int8x8_t cb_biased = vreinterpret_s8_u8(vsub_u8(cb_bytes, signflip));
3210 
3211  // expand to s16
3212  int16x8_t yws = vreinterpretq_s16_u16(vshll_n_u8(y_bytes, 4));
3213  int16x8_t crw = vshll_n_s8(cr_biased, 7);
3214  int16x8_t cbw = vshll_n_s8(cb_biased, 7);
3215 
3216  // color transform
3217  int16x8_t cr0 = vqdmulhq_s16(crw, cr_const0);
3218  int16x8_t cb0 = vqdmulhq_s16(cbw, cb_const0);
3219  int16x8_t cr1 = vqdmulhq_s16(crw, cr_const1);
3220  int16x8_t cb1 = vqdmulhq_s16(cbw, cb_const1);
3221  int16x8_t rws = vaddq_s16(yws, cr0);
3222  int16x8_t gws = vaddq_s16(vaddq_s16(yws, cb0), cr1);
3223  int16x8_t bws = vaddq_s16(yws, cb1);
3224 
3225  // undo scaling, round, convert to byte
3226  uint8x8x4_t o;
3227  o.val[0] = vqrshrun_n_s16(rws, 4);
3228  o.val[1] = vqrshrun_n_s16(gws, 4);
3229  o.val[2] = vqrshrun_n_s16(bws, 4);
3230  o.val[3] = vdup_n_u8(255);
3231 
3232  // store, interleaving r/g/b/a
3233  vst4_u8(out, o);
3234  out += 8*4;
3235  }
3236  }
3237 #endif
3238 
3239  for (; i < count; ++i) {
3240  int y_fixed = (y[i] << 20) + (1<<19); // rounding
3241  int r,g,b;
3242  int cr = pcr[i] - 128;
3243  int cb = pcb[i] - 128;
3244  r = y_fixed + cr* float2fixed(1.40200f);
3245  g = y_fixed + cr*-float2fixed(0.71414f) + ((cb*-float2fixed(0.34414f)) & 0xffff0000);
3246  b = y_fixed + cb* float2fixed(1.77200f);
3247  r >>= 20;
3248  g >>= 20;
3249  b >>= 20;
3250  if ((unsigned) r > 255) { if (r < 0) r = 0; else r = 255; }
3251  if ((unsigned) g > 255) { if (g < 0) g = 0; else g = 255; }
3252  if ((unsigned) b > 255) { if (b < 0) b = 0; else b = 255; }
3253  out[0] = (stbi_uc)r;
3254  out[1] = (stbi_uc)g;
3255  out[2] = (stbi_uc)b;
3256  out[3] = 255;
3257  out += step;
3258  }
3259 }
3260 #endif
3261 
3262 // set up the kernels
3263 static void stbi__setup_jpeg(stbi__jpeg *j)
3264 {
3265  j->idct_block_kernel = stbi__idct_block;
3266  j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_row;
3267  j->resample_row_hv_2_kernel = stbi__resample_row_hv_2;
3268 
3269 #ifdef STBI_SSE2
3270  if (stbi__sse2_available()) {
3271  j->idct_block_kernel = stbi__idct_simd;
3272  #ifndef STBI_JPEG_OLD
3273  j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_simd;
3274  #endif
3275  j->resample_row_hv_2_kernel = stbi__resample_row_hv_2_simd;
3276  }
3277 #endif
3278 
3279 #ifdef STBI_NEON
3280  j->idct_block_kernel = stbi__idct_simd;
3281  #ifndef STBI_JPEG_OLD
3282  j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_simd;
3283  #endif
3284  j->resample_row_hv_2_kernel = stbi__resample_row_hv_2_simd;
3285 #endif
3286 }
3287 
3288 // clean up the temporary component buffers
3289 static void stbi__cleanup_jpeg(stbi__jpeg *j)
3290 {
3291  int i;
3292  for (i=0; i < j->s->img_n; ++i) {
3293  if (j->img_comp[i].raw_data) {
3294  STBI_FREE(j->img_comp[i].raw_data);
3295  j->img_comp[i].raw_data = NULL;
3296  j->img_comp[i].data = NULL;
3297  }
3298  if (j->img_comp[i].raw_coeff) {
3299  STBI_FREE(j->img_comp[i].raw_coeff);
3300  j->img_comp[i].raw_coeff = 0;
3301  j->img_comp[i].coeff = 0;
3302  }
3303  if (j->img_comp[i].linebuf) {
3304  STBI_FREE(j->img_comp[i].linebuf);
3305  j->img_comp[i].linebuf = NULL;
3306  }
3307  }
3308 }
3309 
3310 typedef struct
3311 {
3312  resample_row_func resample;
3313  stbi_uc *line0,*line1;
3314  int hs,vs; // expansion factor in each axis
3315  int w_lores; // horizontal pixels pre-expansion
3316  int ystep; // how far through vertical expansion we are
3317  int ypos; // which pre-expansion row we're on
3318 } stbi__resample;
3319 
3320 static stbi_uc *load_jpeg_image(stbi__jpeg *z, int *out_x, int *out_y, int *comp, int req_comp)
3321 {
3322  int n, decode_n;
3323  z->s->img_n = 0; // make stbi__cleanup_jpeg safe
3324 
3325  // validate req_comp
3326  if (req_comp < 0 || req_comp > 4) return stbi__errpuc("bad req_comp", "Internal error");
3327 
3328  // load a jpeg image from whichever source, but leave in YCbCr format
3329  if (!stbi__decode_jpeg_image(z)) { stbi__cleanup_jpeg(z); return NULL; }
3330 
3331  // determine actual number of components to generate
3332  n = req_comp ? req_comp : z->s->img_n;
3333 
3334  if (z->s->img_n == 3 && n < 3)
3335  decode_n = 1;
3336  else
3337  decode_n = z->s->img_n;
3338 
3339  // resample and color-convert
3340  {
3341  int k;
3342  unsigned int i,j;
3343  stbi_uc *output;
3344  stbi_uc *coutput[4];
3345 
3346  stbi__resample res_comp[4];
3347 
3348  for (k=0; k < decode_n; ++k) {
3349  stbi__resample *r = &res_comp[k];
3350 
3351  // allocate line buffer big enough for upsampling off the edges
3352  // with upsample factor of 4
3353  z->img_comp[k].linebuf = (stbi_uc *) stbi__malloc(z->s->img_x + 3);
3354  if (!z->img_comp[k].linebuf) { stbi__cleanup_jpeg(z); return stbi__errpuc("outofmem", "Out of memory"); }
3355 
3356  r->hs = z->img_h_max / z->img_comp[k].h;
3357  r->vs = z->img_v_max / z->img_comp[k].v;
3358  r->ystep = r->vs >> 1;
3359  r->w_lores = (z->s->img_x + r->hs-1) / r->hs;
3360  r->ypos = 0;
3361  r->line0 = r->line1 = z->img_comp[k].data;
3362 
3363  if (r->hs == 1 && r->vs == 1) r->resample = resample_row_1;
3364  else if (r->hs == 1 && r->vs == 2) r->resample = stbi__resample_row_v_2;
3365  else if (r->hs == 2 && r->vs == 1) r->resample = stbi__resample_row_h_2;
3366  else if (r->hs == 2 && r->vs == 2) r->resample = z->resample_row_hv_2_kernel;
3367  else r->resample = stbi__resample_row_generic;
3368  }
3369 
3370  // can't error after this so, this is safe
3371  output = (stbi_uc *) stbi__malloc(n * z->s->img_x * z->s->img_y + 1);
3372  if (!output) { stbi__cleanup_jpeg(z); return stbi__errpuc("outofmem", "Out of memory"); }
3373 
3374  // now go ahead and resample
3375  for (j=0; j < z->s->img_y; ++j) {
3376  stbi_uc *out = output + n * z->s->img_x * j;
3377  for (k=0; k < decode_n; ++k) {
3378  stbi__resample *r = &res_comp[k];
3379  int y_bot = r->ystep >= (r->vs >> 1);
3380  coutput[k] = r->resample(z->img_comp[k].linebuf,
3381  y_bot ? r->line1 : r->line0,
3382  y_bot ? r->line0 : r->line1,
3383  r->w_lores, r->hs);
3384  if (++r->ystep >= r->vs) {
3385  r->ystep = 0;
3386  r->line0 = r->line1;
3387  if (++r->ypos < z->img_comp[k].y)
3388  r->line1 += z->img_comp[k].w2;
3389  }
3390  }
3391  if (n >= 3) {
3392  stbi_uc *y = coutput[0];
3393  if (z->s->img_n == 3) {
3394  if (z->rgb == 3) {
3395  for (i=0; i < z->s->img_x; ++i) {
3396  out[0] = y[i];
3397  out[1] = coutput[1][i];
3398  out[2] = coutput[2][i];
3399  out[3] = 255;
3400  out += n;
3401  }
3402  } else {
3403  z->YCbCr_to_RGB_kernel(out, y, coutput[1], coutput[2], z->s->img_x, n);
3404  }
3405  } else
3406  for (i=0; i < z->s->img_x; ++i) {
3407  out[0] = out[1] = out[2] = y[i];
3408  out[3] = 255; // not used if n==3
3409  out += n;
3410  }
3411  } else {
3412  stbi_uc *y = coutput[0];
3413  if (n == 1)
3414  for (i=0; i < z->s->img_x; ++i) out[i] = y[i];
3415  else
3416  for (i=0; i < z->s->img_x; ++i) *out++ = y[i], *out++ = 255;
3417  }
3418  }
3419  stbi__cleanup_jpeg(z);
3420  *out_x = z->s->img_x;
3421  *out_y = z->s->img_y;
3422  if (comp) *comp = z->s->img_n; // report original components, not output
3423  return output;
3424  }
3425 }
3426 
3427 static unsigned char *stbi__jpeg_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
3428 {
3429  unsigned char* result;
3430  stbi__jpeg* j = (stbi__jpeg*) stbi__malloc(sizeof(stbi__jpeg));
3431  j->s = s;
3432  stbi__setup_jpeg(j);
3433  result = load_jpeg_image(j, x,y,comp,req_comp);
3434  STBI_FREE(j);
3435  return result;
3436 }
3437 
3438 static int stbi__jpeg_test(stbi__context *s)
3439 {
3440  int r;
3441  stbi__jpeg j;
3442  j.s = s;
3443  stbi__setup_jpeg(&j);
3444  r = stbi__decode_jpeg_header(&j, STBI__SCAN_type);
3445  stbi__rewind(s);
3446  return r;
3447 }
3448 
3449 static int stbi__jpeg_info_raw(stbi__jpeg *j, int *x, int *y, int *comp)
3450 {
3451  if (!stbi__decode_jpeg_header(j, STBI__SCAN_header)) {
3452  stbi__rewind( j->s );
3453  return 0;
3454  }
3455  if (x) *x = j->s->img_x;
3456  if (y) *y = j->s->img_y;
3457  if (comp) *comp = j->s->img_n;
3458  return 1;
3459 }
3460 
3461 static int stbi__jpeg_info(stbi__context *s, int *x, int *y, int *comp)
3462 {
3463  int result;
3464  stbi__jpeg* j = (stbi__jpeg*) (stbi__malloc(sizeof(stbi__jpeg)));
3465  j->s = s;
3466  result = stbi__jpeg_info_raw(j, x, y, comp);
3467  STBI_FREE(j);
3468  return result;
3469 }
3470 #endif
3471 
3472 // public domain zlib decode v0.2 Sean Barrett 2006-11-18
3473 // simple implementation
3474 // - all input must be provided in an upfront buffer
3475 // - all output is written to a single output buffer (can malloc/realloc)
3476 // performance
3477 // - fast huffman
3478 
3479 #ifndef STBI_NO_ZLIB
3480 
3481 // fast-way is faster to check than jpeg huffman, but slow way is slower
3482 #define STBI__ZFAST_BITS 9 // accelerate all cases in default tables
3483 #define STBI__ZFAST_MASK ((1 << STBI__ZFAST_BITS) - 1)
3484 
3485 // zlib-style huffman encoding
3486 // (jpegs packs from left, zlib from right, so can't share code)
3487 typedef struct
3488 {
3489  stbi__uint16 fast[1 << STBI__ZFAST_BITS];
3490  stbi__uint16 firstcode[16];
3491  int maxcode[17];
3492  stbi__uint16 firstsymbol[16];
3493  stbi_uc size[288];
3494  stbi__uint16 value[288];
3495 } stbi__zhuffman;
3496 
3497 stbi_inline static int stbi__bitreverse16(int n)
3498 {
3499  n = ((n & 0xAAAA) >> 1) | ((n & 0x5555) << 1);
3500  n = ((n & 0xCCCC) >> 2) | ((n & 0x3333) << 2);
3501  n = ((n & 0xF0F0) >> 4) | ((n & 0x0F0F) << 4);
3502  n = ((n & 0xFF00) >> 8) | ((n & 0x00FF) << 8);
3503  return n;
3504 }
3505 
3506 stbi_inline static int stbi__bit_reverse(int v, int bits)
3507 {
3508  STBI_ASSERT(bits <= 16);
3509  // to bit reverse n bits, reverse 16 and shift
3510  // e.g. 11 bits, bit reverse and shift away 5
3511  return stbi__bitreverse16(v) >> (16-bits);
3512 }
3513 
3514 static int stbi__zbuild_huffman(stbi__zhuffman *z, stbi_uc *sizelist, int num)
3515 {
3516  int i,k=0;
3517  int code, next_code[16], sizes[17];
3518 
3519  // DEFLATE spec for generating codes
3520  memset(sizes, 0, sizeof(sizes));
3521  memset(z->fast, 0, sizeof(z->fast));
3522  for (i=0; i < num; ++i)
3523  ++sizes[sizelist[i]];
3524  sizes[0] = 0;
3525  for (i=1; i < 16; ++i)
3526  if (sizes[i] > (1 << i))
3527  return stbi__err("bad sizes", "Corrupt PNG");
3528  code = 0;
3529  for (i=1; i < 16; ++i) {
3530  next_code[i] = code;
3531  z->firstcode[i] = (stbi__uint16) code;
3532  z->firstsymbol[i] = (stbi__uint16) k;
3533  code = (code + sizes[i]);
3534  if (sizes[i])
3535  if (code-1 >= (1 << i)) return stbi__err("bad codelengths","Corrupt PNG");
3536  z->maxcode[i] = code << (16-i); // preshift for inner loop
3537  code <<= 1;
3538  k += sizes[i];
3539  }
3540  z->maxcode[16] = 0x10000; // sentinel
3541  for (i=0; i < num; ++i) {
3542  int s = sizelist[i];
3543  if (s) {
3544  int c = next_code[s] - z->firstcode[s] + z->firstsymbol[s];
3545  stbi__uint16 fastv = (stbi__uint16) ((s << 9) | i);
3546  z->size [c] = (stbi_uc ) s;
3547  z->value[c] = (stbi__uint16) i;
3548  if (s <= STBI__ZFAST_BITS) {
3549  int j = stbi__bit_reverse(next_code[s],s);
3550  while (j < (1 << STBI__ZFAST_BITS)) {
3551  z->fast[j] = fastv;
3552  j += (1 << s);
3553  }
3554  }
3555  ++next_code[s];
3556  }
3557  }
3558  return 1;
3559 }
3560 
3561 // zlib-from-memory implementation for PNG reading
3562 // because PNG allows splitting the zlib stream arbitrarily,
3563 // and it's annoying structurally to have PNG call ZLIB call PNG,
3564 // we require PNG read all the IDATs and combine them into a single
3565 // memory buffer
3566 
3567 typedef struct
3568 {
3569  stbi_uc *zbuffer, *zbuffer_end;
3570  int num_bits;
3571  stbi__uint32 code_buffer;
3572 
3573  char *zout;
3574  char *zout_start;
3575  char *zout_end;
3576  int z_expandable;
3577 
3578  stbi__zhuffman z_length, z_distance;
3579 } stbi__zbuf;
3580 
3581 stbi_inline static stbi_uc stbi__zget8(stbi__zbuf *z)
3582 {
3583  if (z->zbuffer >= z->zbuffer_end) return 0;
3584  return *z->zbuffer++;
3585 }
3586 
3587 static void stbi__fill_bits(stbi__zbuf *z)
3588 {
3589  do {
3590  STBI_ASSERT(z->code_buffer < (1U << z->num_bits));
3591  z->code_buffer |= (unsigned int) stbi__zget8(z) << z->num_bits;
3592  z->num_bits += 8;
3593  } while (z->num_bits <= 24);
3594 }
3595 
3596 stbi_inline static unsigned int stbi__zreceive(stbi__zbuf *z, int n)
3597 {
3598  unsigned int k;
3599  if (z->num_bits < n) stbi__fill_bits(z);
3600  k = z->code_buffer & ((1 << n) - 1);
3601  z->code_buffer >>= n;
3602  z->num_bits -= n;
3603  return k;
3604 }
3605 
3606 static int stbi__zhuffman_decode_slowpath(stbi__zbuf *a, stbi__zhuffman *z)
3607 {
3608  int b,s,k;
3609  // not resolved by fast table, so compute it the slow way
3610  // use jpeg approach, which requires MSbits at top
3611  k = stbi__bit_reverse(a->code_buffer, 16);
3612  for (s=STBI__ZFAST_BITS+1; ; ++s)
3613  if (k < z->maxcode[s])
3614  break;
3615  if (s == 16) return -1; // invalid code!
3616  // code size is s, so:
3617  b = (k >> (16-s)) - z->firstcode[s] + z->firstsymbol[s];
3618  STBI_ASSERT(z->size[b] == s);
3619  a->code_buffer >>= s;
3620  a->num_bits -= s;
3621  return z->value[b];
3622 }
3623 
3624 stbi_inline static int stbi__zhuffman_decode(stbi__zbuf *a, stbi__zhuffman *z)
3625 {
3626  int b,s;
3627  if (a->num_bits < 16) stbi__fill_bits(a);
3628  b = z->fast[a->code_buffer & STBI__ZFAST_MASK];
3629  if (b) {
3630  s = b >> 9;
3631  a->code_buffer >>= s;
3632  a->num_bits -= s;
3633  return b & 511;
3634  }
3635  return stbi__zhuffman_decode_slowpath(a, z);
3636 }
3637 
3638 static int stbi__zexpand(stbi__zbuf *z, char *zout, int n) // need to make room for n bytes
3639 {
3640  char *q;
3641  int cur, limit, old_limit;
3642  z->zout = zout;
3643  if (!z->z_expandable) return stbi__err("output buffer limit","Corrupt PNG");
3644  cur = (int) (z->zout - z->zout_start);
3645  limit = old_limit = (int) (z->zout_end - z->zout_start);
3646  while (cur + n > limit)
3647  limit *= 2;
3648  q = (char *) STBI_REALLOC_SIZED(z->zout_start, old_limit, limit);
3649  STBI_NOTUSED(old_limit);
3650  if (q == NULL) return stbi__err("outofmem", "Out of memory");
3651  z->zout_start = q;
3652  z->zout = q + cur;
3653  z->zout_end = q + limit;
3654  return 1;
3655 }
3656 
3657 static int stbi__zlength_base[31] = {
3658  3,4,5,6,7,8,9,10,11,13,
3659  15,17,19,23,27,31,35,43,51,59,
3660  67,83,99,115,131,163,195,227,258,0,0 };
3661 
3662 static int stbi__zlength_extra[31]=
3663 { 0,0,0,0,0,0,0,0,1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5,0,0,0 };
3664 
3665 static int stbi__zdist_base[32] = { 1,2,3,4,5,7,9,13,17,25,33,49,65,97,129,193,
3666 257,385,513,769,1025,1537,2049,3073,4097,6145,8193,12289,16385,24577,0,0};
3667 
3668 static int stbi__zdist_extra[32] =
3669 { 0,0,0,0,1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10,11,11,12,12,13,13};
3670 
3671 static int stbi__parse_huffman_block(stbi__zbuf *a)
3672 {
3673  char *zout = a->zout;
3674  for(;;) {
3675  int z = stbi__zhuffman_decode(a, &a->z_length);
3676  if (z < 256) {
3677  if (z < 0) return stbi__err("bad huffman code","Corrupt PNG"); // error in huffman codes
3678  if (zout >= a->zout_end) {
3679  if (!stbi__zexpand(a, zout, 1)) return 0;
3680  zout = a->zout;
3681  }
3682  *zout++ = (char) z;
3683  } else {
3684  stbi_uc *p;
3685  int len,dist;
3686  if (z == 256) {
3687  a->zout = zout;
3688  return 1;
3689  }
3690  z -= 257;
3691  len = stbi__zlength_base[z];
3692  if (stbi__zlength_extra[z]) len += stbi__zreceive(a, stbi__zlength_extra[z]);
3693  z = stbi__zhuffman_decode(a, &a->z_distance);
3694  if (z < 0) return stbi__err("bad huffman code","Corrupt PNG");
3695  dist = stbi__zdist_base[z];
3696  if (stbi__zdist_extra[z]) dist += stbi__zreceive(a, stbi__zdist_extra[z]);
3697  if (zout - a->zout_start < dist) return stbi__err("bad dist","Corrupt PNG");
3698  if (zout + len > a->zout_end) {
3699  if (!stbi__zexpand(a, zout, len)) return 0;
3700  zout = a->zout;
3701  }
3702  p = (stbi_uc *) (zout - dist);
3703  if (dist == 1) { // run of one byte; common in images.
3704  stbi_uc v = *p;
3705  if (len) { do *zout++ = v; while (--len); }
3706  } else {
3707  if (len) { do *zout++ = *p++; while (--len); }
3708  }
3709  }
3710  }
3711 }
3712 
3713 static int stbi__compute_huffman_codes(stbi__zbuf *a)
3714 {
3715  static stbi_uc length_dezigzag[19] = { 16,17,18,0,8,7,9,6,10,5,11,4,12,3,13,2,14,1,15 };
3716  stbi__zhuffman z_codelength;
3717  stbi_uc lencodes[286+32+137];//padding for maximum single op
3718  stbi_uc codelength_sizes[19];
3719  int i,n;
3720 
3721  int hlit = stbi__zreceive(a,5) + 257;
3722  int hdist = stbi__zreceive(a,5) + 1;
3723  int hclen = stbi__zreceive(a,4) + 4;
3724 
3725  memset(codelength_sizes, 0, sizeof(codelength_sizes));
3726  for (i=0; i < hclen; ++i) {
3727  int s = stbi__zreceive(a,3);
3728  codelength_sizes[length_dezigzag[i]] = (stbi_uc) s;
3729  }
3730  if (!stbi__zbuild_huffman(&z_codelength, codelength_sizes, 19)) return 0;
3731 
3732  n = 0;
3733  while (n < hlit + hdist) {
3734  int c = stbi__zhuffman_decode(a, &z_codelength);
3735  if (c < 0 || c >= 19) return stbi__err("bad codelengths", "Corrupt PNG");
3736  if (c < 16)
3737  lencodes[n++] = (stbi_uc) c;
3738  else if (c == 16) {
3739  c = stbi__zreceive(a,2)+3;
3740  memset(lencodes+n, lencodes[n-1], c);
3741  n += c;
3742  } else if (c == 17) {
3743  c = stbi__zreceive(a,3)+3;
3744  memset(lencodes+n, 0, c);
3745  n += c;
3746  } else {
3747  STBI_ASSERT(c == 18);
3748  c = stbi__zreceive(a,7)+11;
3749  memset(lencodes+n, 0, c);
3750  n += c;
3751  }
3752  }
3753  if (n != hlit+hdist) return stbi__err("bad codelengths","Corrupt PNG");
3754  if (!stbi__zbuild_huffman(&a->z_length, lencodes, hlit)) return 0;
3755  if (!stbi__zbuild_huffman(&a->z_distance, lencodes+hlit, hdist)) return 0;
3756  return 1;
3757 }
3758 
3759 static int stbi__parse_uncompressed_block(stbi__zbuf *a)
3760 {
3761  stbi_uc header[4];
3762  int len,nlen,k;
3763  if (a->num_bits & 7)
3764  stbi__zreceive(a, a->num_bits & 7); // discard
3765  // drain the bit-packed data into header
3766  k = 0;
3767  while (a->num_bits > 0) {
3768  header[k++] = (stbi_uc) (a->code_buffer & 255); // suppress MSVC run-time check
3769  a->code_buffer >>= 8;
3770  a->num_bits -= 8;
3771  }
3772  STBI_ASSERT(a->num_bits == 0);
3773  // now fill header the normal way
3774  while (k < 4)
3775  header[k++] = stbi__zget8(a);
3776  len = header[1] * 256 + header[0];
3777  nlen = header[3] * 256 + header[2];
3778  if (nlen != (len ^ 0xffff)) return stbi__err("zlib corrupt","Corrupt PNG");
3779  if (a->zbuffer + len > a->zbuffer_end) return stbi__err("read past buffer","Corrupt PNG");
3780  if (a->zout + len > a->zout_end)
3781  if (!stbi__zexpand(a, a->zout, len)) return 0;
3782  memcpy(a->zout, a->zbuffer, len);
3783  a->zbuffer += len;
3784  a->zout += len;
3785  return 1;
3786 }
3787 
3788 static int stbi__parse_zlib_header(stbi__zbuf *a)
3789 {
3790  int cmf = stbi__zget8(a);
3791  int cm = cmf & 15;
3792  /* int cinfo = cmf >> 4; */
3793  int flg = stbi__zget8(a);
3794  if ((cmf*256+flg) % 31 != 0) return stbi__err("bad zlib header","Corrupt PNG"); // zlib spec
3795  if (flg & 32) return stbi__err("no preset dict","Corrupt PNG"); // preset dictionary not allowed in png
3796  if (cm != 8) return stbi__err("bad compression","Corrupt PNG"); // DEFLATE required for png
3797  // window = 1 << (8 + cinfo)... but who cares, we fully buffer output
3798  return 1;
3799 }
3800 
3801 // @TODO: should statically initialize these for optimal thread safety
3802 static stbi_uc stbi__zdefault_length[288], stbi__zdefault_distance[32];
3803 static void stbi__init_zdefaults(void)
3804 {
3805  int i; // use <= to match clearly with spec
3806  for (i=0; i <= 143; ++i) stbi__zdefault_length[i] = 8;
3807  for ( ; i <= 255; ++i) stbi__zdefault_length[i] = 9;
3808  for ( ; i <= 279; ++i) stbi__zdefault_length[i] = 7;
3809  for ( ; i <= 287; ++i) stbi__zdefault_length[i] = 8;
3810 
3811  for (i=0; i <= 31; ++i) stbi__zdefault_distance[i] = 5;
3812 }
3813 
3814 static int stbi__parse_zlib(stbi__zbuf *a, int parse_header)
3815 {
3816  int final, type;
3817  if (parse_header)
3818  if (!stbi__parse_zlib_header(a)) return 0;
3819  a->num_bits = 0;
3820  a->code_buffer = 0;
3821  do {
3822  final = stbi__zreceive(a,1);
3823  type = stbi__zreceive(a,2);
3824  if (type == 0) {
3825  if (!stbi__parse_uncompressed_block(a)) return 0;
3826  } else if (type == 3) {
3827  return 0;
3828  } else {
3829  if (type == 1) {
3830  // use fixed code lengths
3831  if (!stbi__zdefault_distance[31]) stbi__init_zdefaults();
3832  if (!stbi__zbuild_huffman(&a->z_length , stbi__zdefault_length , 288)) return 0;
3833  if (!stbi__zbuild_huffman(&a->z_distance, stbi__zdefault_distance, 32)) return 0;
3834  } else {
3835  if (!stbi__compute_huffman_codes(a)) return 0;
3836  }
3837  if (!stbi__parse_huffman_block(a)) return 0;
3838  }
3839  } while (!final);
3840  return 1;
3841 }
3842 
3843 static int stbi__do_zlib(stbi__zbuf *a, char *obuf, int olen, int exp, int parse_header)
3844 {
3845  a->zout_start = obuf;
3846  a->zout = obuf;
3847  a->zout_end = obuf + olen;
3848  a->z_expandable = exp;
3849 
3850  return stbi__parse_zlib(a, parse_header);
3851 }
3852 
3853 STBIDEF char *stbi_zlib_decode_malloc_guesssize(const char *buffer, int len, int initial_size, int *outlen)
3854 {
3855  stbi__zbuf a;
3856  char *p = (char *) stbi__malloc(initial_size);
3857  if (p == NULL) return NULL;
3858  a.zbuffer = (stbi_uc *) buffer;
3859  a.zbuffer_end = (stbi_uc *) buffer + len;
3860  if (stbi__do_zlib(&a, p, initial_size, 1, 1)) {
3861  if (outlen) *outlen = (int) (a.zout - a.zout_start);
3862  return a.zout_start;
3863  } else {
3864  STBI_FREE(a.zout_start);
3865  return NULL;
3866  }
3867 }
3868 
3869 STBIDEF char *stbi_zlib_decode_malloc(char const *buffer, int len, int *outlen)
3870 {
3871  return stbi_zlib_decode_malloc_guesssize(buffer, len, 16384, outlen);
3872 }
3873 
3874 STBIDEF char *stbi_zlib_decode_malloc_guesssize_headerflag(const char *buffer, int len, int initial_size, int *outlen, int parse_header)
3875 {
3876  stbi__zbuf a;
3877  char *p = (char *) stbi__malloc(initial_size);
3878  if (p == NULL) return NULL;
3879  a.zbuffer = (stbi_uc *) buffer;
3880  a.zbuffer_end = (stbi_uc *) buffer + len;
3881  if (stbi__do_zlib(&a, p, initial_size, 1, parse_header)) {
3882  if (outlen) *outlen = (int) (a.zout - a.zout_start);
3883  return a.zout_start;
3884  } else {
3885  STBI_FREE(a.zout_start);
3886  return NULL;
3887  }
3888 }
3889 
3890 STBIDEF int stbi_zlib_decode_buffer(char *obuffer, int olen, char const *ibuffer, int ilen)
3891 {
3892  stbi__zbuf a;
3893  a.zbuffer = (stbi_uc *) ibuffer;
3894  a.zbuffer_end = (stbi_uc *) ibuffer + ilen;
3895  if (stbi__do_zlib(&a, obuffer, olen, 0, 1))
3896  return (int) (a.zout - a.zout_start);
3897  else
3898  return -1;
3899 }
3900 
3901 STBIDEF char *stbi_zlib_decode_noheader_malloc(char const *buffer, int len, int *outlen)
3902 {
3903  stbi__zbuf a;
3904  char *p = (char *) stbi__malloc(16384);
3905  if (p == NULL) return NULL;
3906  a.zbuffer = (stbi_uc *) buffer;
3907  a.zbuffer_end = (stbi_uc *) buffer+len;
3908  if (stbi__do_zlib(&a, p, 16384, 1, 0)) {
3909  if (outlen) *outlen = (int) (a.zout - a.zout_start);
3910  return a.zout_start;
3911  } else {
3912  STBI_FREE(a.zout_start);
3913  return NULL;
3914  }
3915 }
3916 
3917 STBIDEF int stbi_zlib_decode_noheader_buffer(char *obuffer, int olen, const char *ibuffer, int ilen)
3918 {
3919  stbi__zbuf a;
3920  a.zbuffer = (stbi_uc *) ibuffer;
3921  a.zbuffer_end = (stbi_uc *) ibuffer + ilen;
3922  if (stbi__do_zlib(&a, obuffer, olen, 0, 0))
3923  return (int) (a.zout - a.zout_start);
3924  else
3925  return -1;
3926 }
3927 #endif
3928 
3929 // public domain "baseline" PNG decoder v0.10 Sean Barrett 2006-11-18
3930 // simple implementation
3931 // - only 8-bit samples
3932 // - no CRC checking
3933 // - allocates lots of intermediate memory
3934 // - avoids problem of streaming data between subsystems
3935 // - avoids explicit window management
3936 // performance
3937 // - uses stb_zlib, a PD zlib implementation with fast huffman decoding
3938 
3939 #ifndef STBI_NO_PNG
3940 typedef struct
3941 {
3942  stbi__uint32 length;
3943  stbi__uint32 type;
3944 } stbi__pngchunk;
3945 
3946 static stbi__pngchunk stbi__get_chunk_header(stbi__context *s)
3947 {
3948  stbi__pngchunk c;
3949  c.length = stbi__get32be(s);
3950  c.type = stbi__get32be(s);
3951  return c;
3952 }
3953 
3954 static int stbi__check_png_header(stbi__context *s)
3955 {
3956  static stbi_uc png_sig[8] = { 137,80,78,71,13,10,26,10 };
3957  int i;
3958  for (i=0; i < 8; ++i)
3959  if (stbi__get8(s) != png_sig[i]) return stbi__err("bad png sig","Not a PNG");
3960  return 1;
3961 }
3962 
3963 typedef struct
3964 {
3965  stbi__context *s;
3966  stbi_uc *idata, *expanded, *out;
3967  int depth;
3968 } stbi__png;
3969 
3970 
3971 enum {
3972  STBI__F_none=0,
3973  STBI__F_sub=1,
3974  STBI__F_up=2,
3975  STBI__F_avg=3,
3976  STBI__F_paeth=4,
3977  // synthetic filters used for first scanline to avoid needing a dummy row of 0s
3978  STBI__F_avg_first,
3979  STBI__F_paeth_first
3980 };
3981 
3982 static stbi_uc first_row_filter[5] =
3983 {
3984  STBI__F_none,
3985  STBI__F_sub,
3986  STBI__F_none,
3987  STBI__F_avg_first,
3988  STBI__F_paeth_first
3989 };
3990 
3991 static int stbi__paeth(int a, int b, int c)
3992 {
3993  int p = a + b - c;
3994  int pa = abs(p-a);
3995  int pb = abs(p-b);
3996  int pc = abs(p-c);
3997  if (pa <= pb && pa <= pc) return a;
3998  if (pb <= pc) return b;
3999  return c;
4000 }
4001 
4002 static stbi_uc stbi__depth_scale_table[9] = { 0, 0xff, 0x55, 0, 0x11, 0,0,0, 0x01 };
4003 
4004 // create the png data from post-deflated data
4005 static int stbi__create_png_image_raw(stbi__png *a, stbi_uc *raw, stbi__uint32 raw_len, int out_n, stbi__uint32 x, stbi__uint32 y, int depth, int color)
4006 {
4007  int bytes = (depth == 16? 2 : 1);
4008  stbi__context *s = a->s;
4009  stbi__uint32 i,j,stride = x*out_n*bytes;
4010  stbi__uint32 img_len, img_width_bytes;
4011  int k;
4012  int img_n = s->img_n; // copy it into a local for later
4013 
4014  int output_bytes = out_n*bytes;
4015  int filter_bytes = img_n*bytes;
4016  int width = x;
4017 
4018  STBI_ASSERT(out_n == s->img_n || out_n == s->img_n+1);
4019  a->out = (stbi_uc *) stbi__malloc(x * y * output_bytes); // extra bytes to write off the end into
4020  if (!a->out) return stbi__err("outofmem", "Out of memory");
4021 
4022  img_width_bytes = (((img_n * x * depth) + 7) >> 3);
4023  img_len = (img_width_bytes + 1) * y;
4024  if (s->img_x == x && s->img_y == y) {
4025  if (raw_len != img_len) return stbi__err("not enough pixels","Corrupt PNG");
4026  } else { // interlaced:
4027  if (raw_len < img_len) return stbi__err("not enough pixels","Corrupt PNG");
4028  }
4029 
4030  for (j=0; j < y; ++j) {
4031  stbi_uc *cur = a->out + stride*j;
4032  stbi_uc *prior = cur - stride;
4033  int filter = *raw++;
4034 
4035  if (filter > 4)
4036  return stbi__err("invalid filter","Corrupt PNG");
4037 
4038  if (depth < 8) {
4039  STBI_ASSERT(img_width_bytes <= x);
4040  cur += x*out_n - img_width_bytes; // store output to the rightmost img_len bytes, so we can decode in place
4041  filter_bytes = 1;
4042  width = img_width_bytes;
4043  }
4044 
4045  // if first row, use special filter that doesn't sample previous row
4046  if (j == 0) filter = first_row_filter[filter];
4047 
4048  // handle first byte explicitly
4049  for (k=0; k < filter_bytes; ++k) {
4050  switch (filter) {
4051  case STBI__F_none : cur[k] = raw[k]; break;
4052  case STBI__F_sub : cur[k] = raw[k]; break;
4053  case STBI__F_up : cur[k] = STBI__BYTECAST(raw[k] + prior[k]); break;
4054  case STBI__F_avg : cur[k] = STBI__BYTECAST(raw[k] + (prior[k]>>1)); break;
4055  case STBI__F_paeth : cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(0,prior[k],0)); break;
4056  case STBI__F_avg_first : cur[k] = raw[k]; break;
4057  case STBI__F_paeth_first: cur[k] = raw[k]; break;
4058  }
4059  }
4060 
4061  if (depth == 8) {
4062  if (img_n != out_n)
4063  cur[img_n] = 255; // first pixel
4064  raw += img_n;
4065  cur += out_n;
4066  prior += out_n;
4067  } else if (depth == 16) {
4068  if (img_n != out_n) {
4069  cur[filter_bytes] = 255; // first pixel top byte
4070  cur[filter_bytes+1] = 255; // first pixel bottom byte
4071  }
4072  raw += filter_bytes;
4073  cur += output_bytes;
4074  prior += output_bytes;
4075  } else {
4076  raw += 1;
4077  cur += 1;
4078  prior += 1;
4079  }
4080 
4081  // this is a little gross, so that we don't switch per-pixel or per-component
4082  if (depth < 8 || img_n == out_n) {
4083  int nk = (width - 1)*filter_bytes;
4084  #define CASE(f) \
4085  case f: \
4086  for (k=0; k < nk; ++k)
4087  switch (filter) {
4088  // "none" filter turns into a memcpy here; make that explicit.
4089  case STBI__F_none: memcpy(cur, raw, nk); break;
4090  CASE(STBI__F_sub) cur[k] = STBI__BYTECAST(raw[k] + cur[k-filter_bytes]); break;
4091  CASE(STBI__F_up) cur[k] = STBI__BYTECAST(raw[k] + prior[k]); break;
4092  CASE(STBI__F_avg) cur[k] = STBI__BYTECAST(raw[k] + ((prior[k] + cur[k-filter_bytes])>>1)); break;
4093  CASE(STBI__F_paeth) cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-filter_bytes],prior[k],prior[k-filter_bytes])); break;
4094  CASE(STBI__F_avg_first) cur[k] = STBI__BYTECAST(raw[k] + (cur[k-filter_bytes] >> 1)); break;
4095  CASE(STBI__F_paeth_first) cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-filter_bytes],0,0)); break;
4096  }
4097  #undef CASE
4098  raw += nk;
4099  } else {
4100  STBI_ASSERT(img_n+1 == out_n);
4101  #define CASE(f) \
4102  case f: \
4103  for (i=x-1; i >= 1; --i, cur[filter_bytes]=255,raw+=filter_bytes,cur+=output_bytes,prior+=output_bytes) \
4104  for (k=0; k < filter_bytes; ++k)
4105  switch (filter) {
4106  CASE(STBI__F_none) cur[k] = raw[k]; break;
4107  CASE(STBI__F_sub) cur[k] = STBI__BYTECAST(raw[k] + cur[k- output_bytes]); break;
4108  CASE(STBI__F_up) cur[k] = STBI__BYTECAST(raw[k] + prior[k]); break;
4109  CASE(STBI__F_avg) cur[k] = STBI__BYTECAST(raw[k] + ((prior[k] + cur[k- output_bytes])>>1)); break;
4110  CASE(STBI__F_paeth) cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k- output_bytes],prior[k],prior[k- output_bytes])); break;
4111  CASE(STBI__F_avg_first) cur[k] = STBI__BYTECAST(raw[k] + (cur[k- output_bytes] >> 1)); break;
4112  CASE(STBI__F_paeth_first) cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k- output_bytes],0,0)); break;
4113  }
4114  #undef CASE
4115 
4116  // the loop above sets the high byte of the pixels' alpha, but for
4117  // 16 bit png files we also need the low byte set. we'll do that here.
4118  if (depth == 16) {
4119  cur = a->out + stride*j; // start at the beginning of the row again
4120  for (i=0; i < x; ++i,cur+=output_bytes) {
4121  cur[filter_bytes+1] = 255;
4122  }
4123  }
4124  }
4125  }
4126 
4127  // we make a separate pass to expand bits to pixels; for performance,
4128  // this could run two scanlines behind the above code, so it won't
4129  // intefere with filtering but will still be in the cache.
4130  if (depth < 8) {
4131  for (j=0; j < y; ++j) {
4132  stbi_uc *cur = a->out + stride*j;
4133  stbi_uc *in = a->out + stride*j + x*out_n - img_width_bytes;
4134  // unpack 1/2/4-bit into a 8-bit buffer. allows us to keep the common 8-bit path optimal at minimal cost for 1/2/4-bit
4135  // png guarante byte alignment, if width is not multiple of 8/4/2 we'll decode dummy trailing data that will be skipped in the later loop
4136  stbi_uc scale = (color == 0) ? stbi__depth_scale_table[depth] : 1; // scale grayscale values to 0..255 range
4137 
4138  // note that the final byte might overshoot and write more data than desired.
4139  // we can allocate enough data that this never writes out of memory, but it
4140  // could also overwrite the next scanline. can it overwrite non-empty data
4141  // on the next scanline? yes, consider 1-pixel-wide scanlines with 1-bit-per-pixel.
4142  // so we need to explicitly clamp the final ones
4143 
4144  if (depth == 4) {
4145  for (k=x*img_n; k >= 2; k-=2, ++in) {
4146  *cur++ = scale * ((*in >> 4) );
4147  *cur++ = scale * ((*in ) & 0x0f);
4148  }
4149  if (k > 0) *cur++ = scale * ((*in >> 4) );
4150  } else if (depth == 2) {
4151  for (k=x*img_n; k >= 4; k-=4, ++in) {
4152  *cur++ = scale * ((*in >> 6) );
4153  *cur++ = scale * ((*in >> 4) & 0x03);
4154  *cur++ = scale * ((*in >> 2) & 0x03);
4155  *cur++ = scale * ((*in ) & 0x03);
4156  }
4157  if (k > 0) *cur++ = scale * ((*in >> 6) );
4158  if (k > 1) *cur++ = scale * ((*in >> 4) & 0x03);
4159  if (k > 2) *cur++ = scale * ((*in >> 2) & 0x03);
4160  } else if (depth == 1) {
4161  for (k=x*img_n; k >= 8; k-=8, ++in) {
4162  *cur++ = scale * ((*in >> 7) );
4163  *cur++ = scale * ((*in >> 6) & 0x01);
4164  *cur++ = scale * ((*in >> 5) & 0x01);
4165  *cur++ = scale * ((*in >> 4) & 0x01);
4166  *cur++ = scale * ((*in >> 3) & 0x01);
4167  *cur++ = scale * ((*in >> 2) & 0x01);
4168  *cur++ = scale * ((*in >> 1) & 0x01);
4169  *cur++ = scale * ((*in ) & 0x01);
4170  }
4171  if (k > 0) *cur++ = scale * ((*in >> 7) );
4172  if (k > 1) *cur++ = scale * ((*in >> 6) & 0x01);
4173  if (k > 2) *cur++ = scale * ((*in >> 5) & 0x01);
4174  if (k > 3) *cur++ = scale * ((*in >> 4) & 0x01);
4175  if (k > 4) *cur++ = scale * ((*in >> 3) & 0x01);
4176  if (k > 5) *cur++ = scale * ((*in >> 2) & 0x01);
4177  if (k > 6) *cur++ = scale * ((*in >> 1) & 0x01);
4178  }
4179  if (img_n != out_n) {
4180  int q;
4181  // insert alpha = 255
4182  cur = a->out + stride*j;
4183  if (img_n == 1) {
4184  for (q=x-1; q >= 0; --q) {
4185  cur[q*2+1] = 255;
4186  cur[q*2+0] = cur[q];
4187  }
4188  } else {
4189  STBI_ASSERT(img_n == 3);
4190  for (q=x-1; q >= 0; --q) {
4191  cur[q*4+3] = 255;
4192  cur[q*4+2] = cur[q*3+2];
4193  cur[q*4+1] = cur[q*3+1];
4194  cur[q*4+0] = cur[q*3+0];
4195  }
4196  }
4197  }
4198  }
4199  } else if (depth == 16) {
4200  // force the image data from big-endian to platform-native.
4201  // this is done in a separate pass due to the decoding relying
4202  // on the data being untouched, but could probably be done
4203  // per-line during decode if care is taken.
4204  stbi_uc *cur = a->out;
4205  stbi__uint16 *cur16 = (stbi__uint16*)cur;
4206 
4207  for(i=0; i < x*y*out_n; ++i,cur16++,cur+=2) {
4208  *cur16 = (cur[0] << 8) | cur[1];
4209  }
4210  }
4211 
4212  return 1;
4213 }
4214 
4215 static int stbi__create_png_image(stbi__png *a, stbi_uc *image_data, stbi__uint32 image_data_len, int out_n, int depth, int color, int interlaced)
4216 {
4217  stbi_uc *final;
4218  int p;
4219  if (!interlaced)
4220  return stbi__create_png_image_raw(a, image_data, image_data_len, out_n, a->s->img_x, a->s->img_y, depth, color);
4221 
4222  // de-interlacing
4223  final = (stbi_uc *) stbi__malloc(a->s->img_x * a->s->img_y * out_n);
4224  for (p=0; p < 7; ++p) {
4225  int xorig[] = { 0,4,0,2,0,1,0 };
4226  int yorig[] = { 0,0,4,0,2,0,1 };
4227  int xspc[] = { 8,8,4,4,2,2,1 };
4228  int yspc[] = { 8,8,8,4,4,2,2 };
4229  int i,j,x,y;
4230  // pass1_x[4] = 0, pass1_x[5] = 1, pass1_x[12] = 1
4231  x = (a->s->img_x - xorig[p] + xspc[p]-1) / xspc[p];
4232  y = (a->s->img_y - yorig[p] + yspc[p]-1) / yspc[p];
4233  if (x && y) {
4234  stbi__uint32 img_len = ((((a->s->img_n * x * depth) + 7) >> 3) + 1) * y;
4235  if (!stbi__create_png_image_raw(a, image_data, image_data_len, out_n, x, y, depth, color)) {
4236  STBI_FREE(final);
4237  return 0;
4238  }
4239  for (j=0; j < y; ++j) {
4240  for (i=0; i < x; ++i) {
4241  int out_y = j*yspc[p]+yorig[p];
4242  int out_x = i*xspc[p]+xorig[p];
4243  memcpy(final + out_y*a->s->img_x*out_n + out_x*out_n,
4244  a->out + (j*x+i)*out_n, out_n);
4245  }
4246  }
4247  STBI_FREE(a->out);
4248  image_data += img_len;
4249  image_data_len -= img_len;
4250  }
4251  }
4252  a->out = final;
4253 
4254  return 1;
4255 }
4256 
4257 static int stbi__compute_transparency(stbi__png *z, stbi_uc tc[3], int out_n)
4258 {
4259  stbi__context *s = z->s;
4260  stbi__uint32 i, pixel_count = s->img_x * s->img_y;
4261  stbi_uc *p = z->out;
4262 
4263  // compute color-based transparency, assuming we've
4264  // already got 255 as the alpha value in the output
4265  STBI_ASSERT(out_n == 2 || out_n == 4);
4266 
4267  if (out_n == 2) {
4268  for (i=0; i < pixel_count; ++i) {
4269  p[1] = (p[0] == tc[0] ? 0 : 255);
4270  p += 2;
4271  }
4272  } else {
4273  for (i=0; i < pixel_count; ++i) {
4274  if (p[0] == tc[0] && p[1] == tc[1] && p[2] == tc[2])
4275  p[3] = 0;
4276  p += 4;
4277  }
4278  }
4279  return 1;
4280 }
4281 
4282 static int stbi__compute_transparency16(stbi__png *z, stbi__uint16 tc[3], int out_n)
4283 {
4284  stbi__context *s = z->s;
4285  stbi__uint32 i, pixel_count = s->img_x * s->img_y;
4286  stbi__uint16 *p = (stbi__uint16*) z->out;
4287 
4288  // compute color-based transparency, assuming we've
4289  // already got 65535 as the alpha value in the output
4290  STBI_ASSERT(out_n == 2 || out_n == 4);
4291 
4292  if (out_n == 2) {
4293  for (i = 0; i < pixel_count; ++i) {
4294  p[1] = (p[0] == tc[0] ? 0 : 65535);
4295  p += 2;
4296  }
4297  } else {
4298  for (i = 0; i < pixel_count; ++i) {
4299  if (p[0] == tc[0] && p[1] == tc[1] && p[2] == tc[2])
4300  p[3] = 0;
4301  p += 4;
4302  }
4303  }
4304  return 1;
4305 }
4306 
4307 static int stbi__expand_png_palette(stbi__png *a, stbi_uc *palette, int len, int pal_img_n)
4308 {
4309  stbi__uint32 i, pixel_count = a->s->img_x * a->s->img_y;
4310  stbi_uc *p, *temp_out, *orig = a->out;
4311 
4312  p = (stbi_uc *) stbi__malloc(pixel_count * pal_img_n);
4313  if (p == NULL) return stbi__err("outofmem", "Out of memory");
4314 
4315  // between here and free(out) below, exitting would leak
4316  temp_out = p;
4317 
4318  if (pal_img_n == 3) {
4319  for (i=0; i < pixel_count; ++i) {
4320  int n = orig[i]*4;
4321  p[0] = palette[n ];
4322  p[1] = palette[n+1];
4323  p[2] = palette[n+2];
4324  p += 3;
4325  }
4326  } else {
4327  for (i=0; i < pixel_count; ++i) {
4328  int n = orig[i]*4;
4329  p[0] = palette[n ];
4330  p[1] = palette[n+1];
4331  p[2] = palette[n+2];
4332  p[3] = palette[n+3];
4333  p += 4;
4334  }
4335  }
4336  STBI_FREE(a->out);
4337  a->out = temp_out;
4338 
4339  STBI_NOTUSED(len);
4340 
4341  return 1;
4342 }
4343 
4344 static int stbi__reduce_png(stbi__png *p)
4345 {
4346  int i;
4347  int img_len = p->s->img_x * p->s->img_y * p->s->img_out_n;
4348  stbi_uc *reduced;
4349  stbi__uint16 *orig = (stbi__uint16*)p->out;
4350 
4351  if (p->depth != 16) return 1; // don't need to do anything if not 16-bit data
4352 
4353  reduced = (stbi_uc *)stbi__malloc(img_len);
4354  if (p == NULL) return stbi__err("outofmem", "Out of memory");
4355 
4356  for (i = 0; i < img_len; ++i) reduced[i] = (stbi_uc)((orig[i] >> 8) & 0xFF); // top half of each byte is a decent approx of 16->8 bit scaling
4357 
4358  p->out = reduced;
4359  STBI_FREE(orig);
4360 
4361  return 1;
4362 }
4363 
4364 static int stbi__unpremultiply_on_load = 0;
4365 static int stbi__de_iphone_flag = 0;
4366 
4367 STBIDEF void stbi_set_unpremultiply_on_load(int flag_true_if_should_unpremultiply)
4368 {
4369  stbi__unpremultiply_on_load = flag_true_if_should_unpremultiply;
4370 }
4371 
4372 STBIDEF void stbi_convert_iphone_png_to_rgb(int flag_true_if_should_convert)
4373 {
4374  stbi__de_iphone_flag = flag_true_if_should_convert;
4375 }
4376 
4377 static void stbi__de_iphone(stbi__png *z)
4378 {
4379  stbi__context *s = z->s;
4380  stbi__uint32 i, pixel_count = s->img_x * s->img_y;
4381  stbi_uc *p = z->out;
4382 
4383  if (s->img_out_n == 3) { // convert bgr to rgb
4384  for (i=0; i < pixel_count; ++i) {
4385  stbi_uc t = p[0];
4386  p[0] = p[2];
4387  p[2] = t;
4388  p += 3;
4389  }
4390  } else {
4391  STBI_ASSERT(s->img_out_n == 4);
4392  if (stbi__unpremultiply_on_load) {
4393  // convert bgr to rgb and unpremultiply
4394  for (i=0; i < pixel_count; ++i) {
4395  stbi_uc a = p[3];
4396  stbi_uc t = p[0];
4397  if (a) {
4398  p[0] = p[2] * 255 / a;
4399  p[1] = p[1] * 255 / a;
4400  p[2] = t * 255 / a;
4401  } else {
4402  p[0] = p[2];
4403  p[2] = t;
4404  }
4405  p += 4;
4406  }
4407  } else {
4408  // convert bgr to rgb
4409  for (i=0; i < pixel_count; ++i) {
4410  stbi_uc t = p[0];
4411  p[0] = p[2];
4412  p[2] = t;
4413  p += 4;
4414  }
4415  }
4416  }
4417 }
4418 
4419 #define STBI__PNG_TYPE(a,b,c,d) (((a) << 24) + ((b) << 16) + ((c) << 8) + (d))
4420 
4421 static int stbi__parse_png_file(stbi__png *z, int scan, int req_comp)
4422 {
4423  stbi_uc palette[1024], pal_img_n=0;
4424  stbi_uc has_trans=0, tc[3];
4425  stbi__uint16 tc16[3];
4426  stbi__uint32 ioff=0, idata_limit=0, i, pal_len=0;
4427  int first=1,k,interlace=0, color=0, is_iphone=0;
4428  stbi__context *s = z->s;
4429 
4430  z->expanded = NULL;
4431  z->idata = NULL;
4432  z->out = NULL;
4433 
4434  if (!stbi__check_png_header(s)) return 0;
4435 
4436  if (scan == STBI__SCAN_type) return 1;
4437 
4438  for (;;) {
4439  stbi__pngchunk c = stbi__get_chunk_header(s);
4440  switch (c.type) {
4441  case STBI__PNG_TYPE('C','g','B','I'):
4442  is_iphone = 1;
4443  stbi__skip(s, c.length);
4444  break;
4445  case STBI__PNG_TYPE('I','H','D','R'): {
4446  int comp,filter;
4447  if (!first) return stbi__err("multiple IHDR","Corrupt PNG");
4448  first = 0;
4449  if (c.length != 13) return stbi__err("bad IHDR len","Corrupt PNG");
4450  s->img_x = stbi__get32be(s); if (s->img_x > (1 << 24)) return stbi__err("too large","Very large image (corrupt?)");
4451  s->img_y = stbi__get32be(s); if (s->img_y > (1 << 24)) return stbi__err("too large","Very large image (corrupt?)");
4452  z->depth = stbi__get8(s); if (z->depth != 1 && z->depth != 2 && z->depth != 4 && z->depth != 8 && z->depth != 16) return stbi__err("1/2/4/8/16-bit only","PNG not supported: 1/2/4/8/16-bit only");
4453  color = stbi__get8(s); if (color > 6) return stbi__err("bad ctype","Corrupt PNG");
4454  if (color == 3 && z->depth == 16) return stbi__err("bad ctype","Corrupt PNG");
4455  if (color == 3) pal_img_n = 3; else if (color & 1) return stbi__err("bad ctype","Corrupt PNG");
4456  comp = stbi__get8(s); if (comp) return stbi__err("bad comp method","Corrupt PNG");
4457  filter= stbi__get8(s); if (filter) return stbi__err("bad filter method","Corrupt PNG");
4458  interlace = stbi__get8(s); if (interlace>1) return stbi__err("bad interlace method","Corrupt PNG");
4459  if (!s->img_x || !s->img_y) return stbi__err("0-pixel image","Corrupt PNG");
4460  if (!pal_img_n) {
4461  s->img_n = (color & 2 ? 3 : 1) + (color & 4 ? 1 : 0);
4462  if ((1 << 30) / s->img_x / s->img_n < s->img_y) return stbi__err("too large", "Image too large to decode");
4463  if (scan == STBI__SCAN_header) return 1;
4464  } else {
4465  // if paletted, then pal_n is our final components, and
4466  // img_n is # components to decompress/filter.
4467  s->img_n = 1;
4468  if ((1 << 30) / s->img_x / 4 < s->img_y) return stbi__err("too large","Corrupt PNG");
4469  // if SCAN_header, have to scan to see if we have a tRNS
4470  }
4471  break;
4472  }
4473 
4474  case STBI__PNG_TYPE('P','L','T','E'): {
4475  if (first) return stbi__err("first not IHDR", "Corrupt PNG");
4476  if (c.length > 256*3) return stbi__err("invalid PLTE","Corrupt PNG");
4477  pal_len = c.length / 3;
4478  if (pal_len * 3 != c.length) return stbi__err("invalid PLTE","Corrupt PNG");
4479  for (i=0; i < pal_len; ++i) {
4480  palette[i*4+0] = stbi__get8(s);
4481  palette[i*4+1] = stbi__get8(s);
4482  palette[i*4+2] = stbi__get8(s);
4483  palette[i*4+3] = 255;
4484  }
4485  break;
4486  }
4487 
4488  case STBI__PNG_TYPE('t','R','N','S'): {
4489  if (first) return stbi__err("first not IHDR", "Corrupt PNG");
4490  if (z->idata) return stbi__err("tRNS after IDAT","Corrupt PNG");
4491  if (pal_img_n) {
4492  if (scan == STBI__SCAN_header) { s->img_n = 4; return 1; }
4493  if (pal_len == 0) return stbi__err("tRNS before PLTE","Corrupt PNG");
4494  if (c.length > pal_len) return stbi__err("bad tRNS len","Corrupt PNG");
4495  pal_img_n = 4;
4496  for (i=0; i < c.length; ++i)
4497  palette[i*4+3] = stbi__get8(s);
4498  } else {
4499  if (!(s->img_n & 1)) return stbi__err("tRNS with alpha","Corrupt PNG");
4500  if (c.length != (stbi__uint32) s->img_n*2) return stbi__err("bad tRNS len","Corrupt PNG");
4501  has_trans = 1;
4502  if (z->depth == 16) {
4503  for (k = 0; k < s->img_n; ++k) tc16[k] = stbi__get16be(s); // copy the values as-is
4504  } else {
4505  for (k = 0; k < s->img_n; ++k) tc[k] = (stbi_uc)(stbi__get16be(s) & 255) * stbi__depth_scale_table[z->depth]; // non 8-bit images will be larger
4506  }
4507  }
4508  break;
4509  }
4510 
4511  case STBI__PNG_TYPE('I','D','A','T'): {
4512  if (first) return stbi__err("first not IHDR", "Corrupt PNG");
4513  if (pal_img_n && !pal_len) return stbi__err("no PLTE","Corrupt PNG");
4514  if (scan == STBI__SCAN_header) { s->img_n = pal_img_n; return 1; }
4515  if ((int)(ioff + c.length) < (int)ioff) return 0;
4516  if (ioff + c.length > idata_limit) {
4517  stbi__uint32 idata_limit_old = idata_limit;
4518  stbi_uc *p;
4519  if (idata_limit == 0) idata_limit = c.length > 4096 ? c.length : 4096;
4520  while (ioff + c.length > idata_limit)
4521  idata_limit *= 2;
4522  STBI_NOTUSED(idata_limit_old);
4523  p = (stbi_uc *) STBI_REALLOC_SIZED(z->idata, idata_limit_old, idata_limit); if (p == NULL) return stbi__err("outofmem", "Out of memory");
4524  z->idata = p;
4525  }
4526  if (!stbi__getn(s, z->idata+ioff,c.length)) return stbi__err("outofdata","Corrupt PNG");
4527  ioff += c.length;
4528  break;
4529  }
4530 
4531  case STBI__PNG_TYPE('I','E','N','D'): {
4532  stbi__uint32 raw_len, bpl;
4533  if (first) return stbi__err("first not IHDR", "Corrupt PNG");
4534  if (scan != STBI__SCAN_load) return 1;
4535  if (z->idata == NULL) return stbi__err("no IDAT","Corrupt PNG");
4536  // initial guess for decoded data size to avoid unnecessary reallocs
4537  bpl = (s->img_x * z->depth + 7) / 8; // bytes per line, per component
4538  raw_len = bpl * s->img_y * s->img_n /* pixels */ + s->img_y /* filter mode per row */;
4539  z->expanded = (stbi_uc *) stbi_zlib_decode_malloc_guesssize_headerflag((char *) z->idata, ioff, raw_len, (int *) &raw_len, !is_iphone);
4540  if (z->expanded == NULL) return 0; // zlib should set error
4541  STBI_FREE(z->idata); z->idata = NULL;
4542  if ((req_comp == s->img_n+1 && req_comp != 3 && !pal_img_n) || has_trans)
4543  s->img_out_n = s->img_n+1;
4544  else
4545  s->img_out_n = s->img_n;
4546  if (!stbi__create_png_image(z, z->expanded, raw_len, s->img_out_n, z->depth, color, interlace)) return 0;
4547  if (has_trans) {
4548  if (z->depth == 16) {
4549  if (!stbi__compute_transparency16(z, tc16, s->img_out_n)) return 0;
4550  } else {
4551  if (!stbi__compute_transparency(z, tc, s->img_out_n)) return 0;
4552  }
4553  }
4554  if (is_iphone && stbi__de_iphone_flag && s->img_out_n > 2)
4555  stbi__de_iphone(z);
4556  if (pal_img_n) {
4557  // pal_img_n == 3 or 4
4558  s->img_n = pal_img_n; // record the actual colors we had
4559  s->img_out_n = pal_img_n;
4560  if (req_comp >= 3) s->img_out_n = req_comp;
4561  if (!stbi__expand_png_palette(z, palette, pal_len, s->img_out_n))
4562  return 0;
4563  }
4564  STBI_FREE(z->expanded); z->expanded = NULL;
4565  return 1;
4566  }
4567 
4568  default:
4569  // if critical, fail
4570  if (first) return stbi__err("first not IHDR", "Corrupt PNG");
4571  if ((c.type & (1 << 29)) == 0) {
4572  #ifndef STBI_NO_FAILURE_STRINGS
4573  // not threadsafe
4574  static char invalid_chunk[] = "XXXX PNG chunk not known";
4575  invalid_chunk[0] = STBI__BYTECAST(c.type >> 24);
4576  invalid_chunk[1] = STBI__BYTECAST(c.type >> 16);
4577  invalid_chunk[2] = STBI__BYTECAST(c.type >> 8);
4578  invalid_chunk[3] = STBI__BYTECAST(c.type >> 0);
4579  #endif
4580  return stbi__err(invalid_chunk, "PNG not supported: unknown PNG chunk type");
4581  }
4582  stbi__skip(s, c.length);
4583  break;
4584  }
4585  // end of PNG chunk, read and skip CRC
4586  stbi__get32be(s);
4587  }
4588 }
4589 
4590 static unsigned char *stbi__do_png(stbi__png *p, int *x, int *y, int *n, int req_comp)
4591 {
4592  unsigned char *result=NULL;
4593  if (req_comp < 0 || req_comp > 4) return stbi__errpuc("bad req_comp", "Internal error");
4594  if (stbi__parse_png_file(p, STBI__SCAN_load, req_comp)) {
4595  if (p->depth == 16) {
4596  if (!stbi__reduce_png(p)) {
4597  return result;
4598  }
4599  }
4600  result = p->out;
4601  p->out = NULL;
4602  if (req_comp && req_comp != p->s->img_out_n) {
4603  result = stbi__convert_format(result, p->s->img_out_n, req_comp, p->s->img_x, p->s->img_y);
4604  p->s->img_out_n = req_comp;
4605  if (result == NULL) return result;
4606  }
4607  *x = p->s->img_x;
4608  *y = p->s->img_y;
4609  if (n) *n = p->s->img_n;
4610  }
4611  STBI_FREE(p->out); p->out = NULL;
4612  STBI_FREE(p->expanded); p->expanded = NULL;
4613  STBI_FREE(p->idata); p->idata = NULL;
4614 
4615  return result;
4616 }
4617 
4618 static unsigned char *stbi__png_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
4619 {
4620  stbi__png p;
4621  p.s = s;
4622  return stbi__do_png(&p, x,y,comp,req_comp);
4623 }
4624 
4625 static int stbi__png_test(stbi__context *s)
4626 {
4627  int r;
4628  r = stbi__check_png_header(s);
4629  stbi__rewind(s);
4630  return r;
4631 }
4632 
4633 static int stbi__png_info_raw(stbi__png *p, int *x, int *y, int *comp)
4634 {
4635  if (!stbi__parse_png_file(p, STBI__SCAN_header, 0)) {
4636  stbi__rewind( p->s );
4637  return 0;
4638  }
4639  if (x) *x = p->s->img_x;
4640  if (y) *y = p->s->img_y;
4641  if (comp) *comp = p->s->img_n;
4642  return 1;
4643 }
4644 
4645 static int stbi__png_info(stbi__context *s, int *x, int *y, int *comp)
4646 {
4647  stbi__png p;
4648  p.s = s;
4649  return stbi__png_info_raw(&p, x, y, comp);
4650 }
4651 #endif
4652 
4653 // Microsoft/Windows BMP image
4654 
4655 #ifndef STBI_NO_BMP
4656 static int stbi__bmp_test_raw(stbi__context *s)
4657 {
4658  int r;
4659  int sz;
4660  if (stbi__get8(s) != 'B') return 0;
4661  if (stbi__get8(s) != 'M') return 0;
4662  stbi__get32le(s); // discard filesize
4663  stbi__get16le(s); // discard reserved
4664  stbi__get16le(s); // discard reserved
4665  stbi__get32le(s); // discard data offset
4666  sz = stbi__get32le(s);
4667  r = (sz == 12 || sz == 40 || sz == 56 || sz == 108 || sz == 124);
4668  return r;
4669 }
4670 
4671 static int stbi__bmp_test(stbi__context *s)
4672 {
4673  int r = stbi__bmp_test_raw(s);
4674  stbi__rewind(s);
4675  return r;
4676 }
4677 
4678 
4679 // returns 0..31 for the highest set bit
4680 static int stbi__high_bit(unsigned int z)
4681 {
4682  int n=0;
4683  if (z == 0) return -1;
4684  if (z >= 0x10000) n += 16, z >>= 16;
4685  if (z >= 0x00100) n += 8, z >>= 8;
4686  if (z >= 0x00010) n += 4, z >>= 4;
4687  if (z >= 0x00004) n += 2, z >>= 2;
4688  if (z >= 0x00002) n += 1, z >>= 1;
4689  return n;
4690 }
4691 
4692 static int stbi__bitcount(unsigned int a)
4693 {
4694  a = (a & 0x55555555) + ((a >> 1) & 0x55555555); // max 2
4695  a = (a & 0x33333333) + ((a >> 2) & 0x33333333); // max 4
4696  a = (a + (a >> 4)) & 0x0f0f0f0f; // max 8 per 4, now 8 bits
4697  a = (a + (a >> 8)); // max 16 per 8 bits
4698  a = (a + (a >> 16)); // max 32 per 8 bits
4699  return a & 0xff;
4700 }
4701 
4702 static int stbi__shiftsigned(int v, int shift, int bits)
4703 {
4704  int result;
4705  int z=0;
4706 
4707  if (shift < 0) v <<= -shift;
4708  else v >>= shift;
4709  result = v;
4710 
4711  z = bits;
4712  while (z < 8) {
4713  result += v >> z;
4714  z += bits;
4715  }
4716  return result;
4717 }
4718 
4719 typedef struct
4720 {
4721  int bpp, offset, hsz;
4722  unsigned int mr,mg,mb,ma, all_a;
4723 } stbi__bmp_data;
4724 
4725 static void *stbi__bmp_parse_header(stbi__context *s, stbi__bmp_data *info)
4726 {
4727  int hsz;
4728  if (stbi__get8(s) != 'B' || stbi__get8(s) != 'M') return stbi__errpuc("not BMP", "Corrupt BMP");
4729  stbi__get32le(s); // discard filesize
4730  stbi__get16le(s); // discard reserved
4731  stbi__get16le(s); // discard reserved
4732  info->offset = stbi__get32le(s);
4733  info->hsz = hsz = stbi__get32le(s);
4734  info->mr = info->mg = info->mb = info->ma = 0;
4735 
4736  if (hsz != 12 && hsz != 40 && hsz != 56 && hsz != 108 && hsz != 124) return stbi__errpuc("unknown BMP", "BMP type not supported: unknown");
4737  if (hsz == 12) {
4738  s->img_x = stbi__get16le(s);
4739  s->img_y = stbi__get16le(s);
4740  } else {
4741  s->img_x = stbi__get32le(s);
4742  s->img_y = stbi__get32le(s);
4743  }
4744  if (stbi__get16le(s) != 1) return stbi__errpuc("bad BMP", "bad BMP");
4745  info->bpp = stbi__get16le(s);
4746  if (info->bpp == 1) return stbi__errpuc("monochrome", "BMP type not supported: 1-bit");
4747  if (hsz != 12) {
4748  int compress = stbi__get32le(s);
4749  if (compress == 1 || compress == 2) return stbi__errpuc("BMP RLE", "BMP type not supported: RLE");
4750  stbi__get32le(s); // discard sizeof
4751  stbi__get32le(s); // discard hres
4752  stbi__get32le(s); // discard vres
4753  stbi__get32le(s); // discard colorsused
4754  stbi__get32le(s); // discard max important
4755  if (hsz == 40 || hsz == 56) {
4756  if (hsz == 56) {
4757  stbi__get32le(s);
4758  stbi__get32le(s);
4759  stbi__get32le(s);
4760  stbi__get32le(s);
4761  }
4762  if (info->bpp == 16 || info->bpp == 32) {
4763  if (compress == 0) {
4764  if (info->bpp == 32) {
4765  info->mr = 0xffu << 16;
4766  info->mg = 0xffu << 8;
4767  info->mb = 0xffu << 0;
4768  info->ma = 0xffu << 24;
4769  info->all_a = 0; // if all_a is 0 at end, then we loaded alpha channel but it was all 0
4770  } else {
4771  info->mr = 31u << 10;
4772  info->mg = 31u << 5;
4773  info->mb = 31u << 0;
4774  }
4775  } else if (compress == 3) {
4776  info->mr = stbi__get32le(s);
4777  info->mg = stbi__get32le(s);
4778  info->mb = stbi__get32le(s);
4779  // not documented, but generated by photoshop and handled by mspaint
4780  if (info->mr == info->mg && info->mg == info->mb) {
4781  // ?!?!?
4782  return stbi__errpuc("bad BMP", "bad BMP");
4783  }
4784  } else
4785  return stbi__errpuc("bad BMP", "bad BMP");
4786  }
4787  } else {
4788  int i;
4789  if (hsz != 108 && hsz != 124)
4790  return stbi__errpuc("bad BMP", "bad BMP");
4791  info->mr = stbi__get32le(s);
4792  info->mg = stbi__get32le(s);
4793  info->mb = stbi__get32le(s);
4794  info->ma = stbi__get32le(s);
4795  stbi__get32le(s); // discard color space
4796  for (i=0; i < 12; ++i)
4797  stbi__get32le(s); // discard color space parameters
4798  if (hsz == 124) {
4799  stbi__get32le(s); // discard rendering intent
4800  stbi__get32le(s); // discard offset of profile data
4801  stbi__get32le(s); // discard size of profile data
4802  stbi__get32le(s); // discard reserved
4803  }
4804  }
4805  }
4806  return (void *) 1;
4807 }
4808 
4809 
4810 static stbi_uc *stbi__bmp_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
4811 {
4812  stbi_uc *out;
4813  unsigned int mr=0,mg=0,mb=0,ma=0, all_a;
4814  stbi_uc pal[256][4];
4815  int psize=0,i,j,width;
4816  int flip_vertically, pad, target;
4817  stbi__bmp_data info;
4818 
4819  info.all_a = 255;
4820  if (stbi__bmp_parse_header(s, &info) == NULL)
4821  return NULL; // error code already set
4822 
4823  flip_vertically = ((int) s->img_y) > 0;
4824  s->img_y = abs((int) s->img_y);
4825 
4826  mr = info.mr;
4827  mg = info.mg;
4828  mb = info.mb;
4829  ma = info.ma;
4830  all_a = info.all_a;
4831 
4832  if (info.hsz == 12) {
4833  if (info.bpp < 24)
4834  psize = (info.offset - 14 - 24) / 3;
4835  } else {
4836  if (info.bpp < 16)
4837  psize = (info.offset - 14 - info.hsz) >> 2;
4838  }
4839 
4840  s->img_n = ma ? 4 : 3;
4841  if (req_comp && req_comp >= 3) // we can directly decode 3 or 4
4842  target = req_comp;
4843  else
4844  target = s->img_n; // if they want monochrome, we'll post-convert
4845 
4846  out = (stbi_uc *) stbi__malloc(target * s->img_x * s->img_y);
4847  if (!out) return stbi__errpuc("outofmem", "Out of memory");
4848  if (info.bpp < 16) {
4849  int z=0;
4850  if (psize == 0 || psize > 256) { STBI_FREE(out); return stbi__errpuc("invalid", "Corrupt BMP"); }
4851  for (i=0; i < psize; ++i) {
4852  pal[i][2] = stbi__get8(s);
4853  pal[i][1] = stbi__get8(s);
4854  pal[i][0] = stbi__get8(s);
4855  if (info.hsz != 12) stbi__get8(s);
4856  pal[i][3] = 255;
4857  }
4858  stbi__skip(s, info.offset - 14 - info.hsz - psize * (info.hsz == 12 ? 3 : 4));
4859  if (info.bpp == 4) width = (s->img_x + 1) >> 1;
4860  else if (info.bpp == 8) width = s->img_x;
4861  else { STBI_FREE(out); return stbi__errpuc("bad bpp", "Corrupt BMP"); }
4862  pad = (-width)&3;
4863  for (j=0; j < (int) s->img_y; ++j) {
4864  for (i=0; i < (int) s->img_x; i += 2) {
4865  int v=stbi__get8(s),v2=0;
4866  if (info.bpp == 4) {
4867  v2 = v & 15;
4868  v >>= 4;
4869  }
4870  out[z++] = pal[v][0];
4871  out[z++] = pal[v][1];
4872  out[z++] = pal[v][2];
4873  if (target == 4) out[z++] = 255;
4874  if (i+1 == (int) s->img_x) break;
4875  v = (info.bpp == 8) ? stbi__get8(s) : v2;
4876  out[z++] = pal[v][0];
4877  out[z++] = pal[v][1];
4878  out[z++] = pal[v][2];
4879  if (target == 4) out[z++] = 255;
4880  }
4881  stbi__skip(s, pad);
4882  }
4883  } else {
4884  int rshift=0,gshift=0,bshift=0,ashift=0,rcount=0,gcount=0,bcount=0,acount=0;
4885  int z = 0;
4886  int easy=0;
4887  stbi__skip(s, info.offset - 14 - info.hsz);
4888  if (info.bpp == 24) width = 3 * s->img_x;
4889  else if (info.bpp == 16) width = 2*s->img_x;
4890  else /* bpp = 32 and pad = 0 */ width=0;
4891  pad = (-width) & 3;
4892  if (info.bpp == 24) {
4893  easy = 1;
4894  } else if (info.bpp == 32) {
4895  if (mb == 0xff && mg == 0xff00 && mr == 0x00ff0000 && ma == 0xff000000)
4896  easy = 2;
4897  }
4898  if (!easy) {
4899  if (!mr || !mg || !mb) { STBI_FREE(out); return stbi__errpuc("bad masks", "Corrupt BMP"); }
4900  // right shift amt to put high bit in position #7
4901  rshift = stbi__high_bit(mr)-7; rcount = stbi__bitcount(mr);
4902  gshift = stbi__high_bit(mg)-7; gcount = stbi__bitcount(mg);
4903  bshift = stbi__high_bit(mb)-7; bcount = stbi__bitcount(mb);
4904  ashift = stbi__high_bit(ma)-7; acount = stbi__bitcount(ma);
4905  }
4906  for (j=0; j < (int) s->img_y; ++j) {
4907  if (easy) {
4908  for (i=0; i < (int) s->img_x; ++i) {
4909  unsigned char a;
4910  out[z+2] = stbi__get8(s);
4911  out[z+1] = stbi__get8(s);
4912  out[z+0] = stbi__get8(s);
4913  z += 3;
4914  a = (easy == 2 ? stbi__get8(s) : 255);
4915  all_a |= a;
4916  if (target == 4) out[z++] = a;
4917  }
4918  } else {
4919  int bpp = info.bpp;
4920  for (i=0; i < (int) s->img_x; ++i) {
4921  stbi__uint32 v = (bpp == 16 ? (stbi__uint32) stbi__get16le(s) : stbi__get32le(s));
4922  int a;
4923  out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mr, rshift, rcount));
4924  out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mg, gshift, gcount));
4925  out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mb, bshift, bcount));
4926  a = (ma ? stbi__shiftsigned(v & ma, ashift, acount) : 255);
4927  all_a |= a;
4928  if (target == 4) out[z++] = STBI__BYTECAST(a);
4929  }
4930  }
4931  stbi__skip(s, pad);
4932  }
4933  }
4934 
4935  // if alpha channel is all 0s, replace with all 255s
4936  if (target == 4 && all_a == 0)
4937  for (i=4*s->img_x*s->img_y-1; i >= 0; i -= 4)
4938  out[i] = 255;
4939 
4940  if (flip_vertically) {
4941  stbi_uc t;
4942  for (j=0; j < (int) s->img_y>>1; ++j) {
4943  stbi_uc *p1 = out + j *s->img_x*target;
4944  stbi_uc *p2 = out + (s->img_y-1-j)*s->img_x*target;
4945  for (i=0; i < (int) s->img_x*target; ++i) {
4946  t = p1[i], p1[i] = p2[i], p2[i] = t;
4947  }
4948  }
4949  }
4950 
4951  if (req_comp && req_comp != target) {
4952  out = stbi__convert_format(out, target, req_comp, s->img_x, s->img_y);
4953  if (out == NULL) return out; // stbi__convert_format frees input on failure
4954  }
4955 
4956  *x = s->img_x;
4957  *y = s->img_y;
4958  if (comp) *comp = s->img_n;
4959  return out;
4960 }
4961 #endif
4962 
4963 // Targa Truevision - TGA
4964 // by Jonathan Dummer
4965 #ifndef STBI_NO_TGA
4966 // returns STBI_rgb or whatever, 0 on error
4967 static int stbi__tga_get_comp(int bits_per_pixel, int is_grey, int* is_rgb16)
4968 {
4969  // only RGB or RGBA (incl. 16bit) or grey allowed
4970  if(is_rgb16) *is_rgb16 = 0;
4971  switch(bits_per_pixel) {
4972  case 8: return STBI_grey;
4973  case 16: if(is_grey) return STBI_grey_alpha;
4974  // else: fall-through
4975  case 15: if(is_rgb16) *is_rgb16 = 1;
4976  return STBI_rgb;
4977  case 24: // fall-through
4978  case 32: return bits_per_pixel/8;
4979  default: return 0;
4980  }
4981 }
4982 
4983 static int stbi__tga_info(stbi__context *s, int *x, int *y, int *comp)
4984 {
4985  int tga_w, tga_h, tga_comp, tga_image_type, tga_bits_per_pixel, tga_colormap_bpp;
4986  int sz, tga_colormap_type;
4987  stbi__get8(s); // discard Offset
4988  tga_colormap_type = stbi__get8(s); // colormap type
4989  if( tga_colormap_type > 1 ) {
4990  stbi__rewind(s);
4991  return 0; // only RGB or indexed allowed
4992  }
4993  tga_image_type = stbi__get8(s); // image type
4994  if ( tga_colormap_type == 1 ) { // colormapped (paletted) image
4995  if (tga_image_type != 1 && tga_image_type != 9) {
4996  stbi__rewind(s);
4997  return 0;
4998  }
4999  stbi__skip(s,4); // skip index of first colormap entry and number of entries
5000  sz = stbi__get8(s); // check bits per palette color entry
5001  if ( (sz != 8) && (sz != 15) && (sz != 16) && (sz != 24) && (sz != 32) ) {
5002  stbi__rewind(s);
5003  return 0;
5004  }
5005  stbi__skip(s,4); // skip image x and y origin
5006  tga_colormap_bpp = sz;
5007  } else { // "normal" image w/o colormap - only RGB or grey allowed, +/- RLE
5008  if ( (tga_image_type != 2) && (tga_image_type != 3) && (tga_image_type != 10) && (tga_image_type != 11) ) {
5009  stbi__rewind(s);
5010  return 0; // only RGB or grey allowed, +/- RLE
5011  }
5012  stbi__skip(s,9); // skip colormap specification and image x/y origin
5013  tga_colormap_bpp = 0;
5014  }
5015  tga_w = stbi__get16le(s);
5016  if( tga_w < 1 ) {
5017  stbi__rewind(s);
5018  return 0; // test width
5019  }
5020  tga_h = stbi__get16le(s);
5021  if( tga_h < 1 ) {
5022  stbi__rewind(s);
5023  return 0; // test height
5024  }
5025  tga_bits_per_pixel = stbi__get8(s); // bits per pixel
5026  stbi__get8(s); // ignore alpha bits
5027  if (tga_colormap_bpp != 0) {
5028  if((tga_bits_per_pixel != 8) && (tga_bits_per_pixel != 16)) {
5029  // when using a colormap, tga_bits_per_pixel is the size of the indexes
5030  // I don't think anything but 8 or 16bit indexes makes sense
5031  stbi__rewind(s);
5032  return 0;
5033  }
5034  tga_comp = stbi__tga_get_comp(tga_colormap_bpp, 0, NULL);
5035  } else {
5036  tga_comp = stbi__tga_get_comp(tga_bits_per_pixel, (tga_image_type == 3) || (tga_image_type == 11), NULL);
5037  }
5038  if(!tga_comp) {
5039  stbi__rewind(s);
5040  return 0;
5041  }
5042  if (x) *x = tga_w;
5043  if (y) *y = tga_h;
5044  if (comp) *comp = tga_comp;
5045  return 1; // seems to have passed everything
5046 }
5047 
5048 static int stbi__tga_test(stbi__context *s)
5049 {
5050  int res = 0;
5051  int sz, tga_color_type;
5052  stbi__get8(s); // discard Offset
5053  tga_color_type = stbi__get8(s); // color type
5054  if ( tga_color_type > 1 ) goto errorEnd; // only RGB or indexed allowed
5055  sz = stbi__get8(s); // image type
5056  if ( tga_color_type == 1 ) { // colormapped (paletted) image
5057  if (sz != 1 && sz != 9) goto errorEnd; // colortype 1 demands image type 1 or 9
5058  stbi__skip(s,4); // skip index of first colormap entry and number of entries
5059  sz = stbi__get8(s); // check bits per palette color entry
5060  if ( (sz != 8) && (sz != 15) && (sz != 16) && (sz != 24) && (sz != 32) ) goto errorEnd;
5061  stbi__skip(s,4); // skip image x and y origin
5062  } else { // "normal" image w/o colormap
5063  if ( (sz != 2) && (sz != 3) && (sz != 10) && (sz != 11) ) goto errorEnd; // only RGB or grey allowed, +/- RLE
5064  stbi__skip(s,9); // skip colormap specification and image x/y origin
5065  }
5066  if ( stbi__get16le(s) < 1 ) goto errorEnd; // test width
5067  if ( stbi__get16le(s) < 1 ) goto errorEnd; // test height
5068  sz = stbi__get8(s); // bits per pixel
5069  if ( (tga_color_type == 1) && (sz != 8) && (sz != 16) ) goto errorEnd; // for colormapped images, bpp is size of an index
5070  if ( (sz != 8) && (sz != 15) && (sz != 16) && (sz != 24) && (sz != 32) ) goto errorEnd;
5071 
5072  res = 1; // if we got this far, everything's good and we can return 1 instead of 0
5073 
5074 errorEnd:
5075  stbi__rewind(s);
5076  return res;
5077 }
5078 
5079 // read 16bit value and convert to 24bit RGB
5080 void stbi__tga_read_rgb16(stbi__context *s, stbi_uc* out)
5081 {
5082  stbi__uint16 px = stbi__get16le(s);
5083  stbi__uint16 fiveBitMask = 31;
5084  // we have 3 channels with 5bits each
5085  int r = (px >> 10) & fiveBitMask;
5086  int g = (px >> 5) & fiveBitMask;
5087  int b = px & fiveBitMask;
5088  // Note that this saves the data in RGB(A) order, so it doesn't need to be swapped later
5089  out[0] = (r * 255)/31;
5090  out[1] = (g * 255)/31;
5091  out[2] = (b * 255)/31;
5092 
5093  // some people claim that the most significant bit might be used for alpha
5094  // (possibly if an alpha-bit is set in the "image descriptor byte")
5095  // but that only made 16bit test images completely translucent..
5096  // so let's treat all 15 and 16bit TGAs as RGB with no alpha.
5097 }
5098 
5099 static stbi_uc *stbi__tga_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
5100 {
5101  // read in the TGA header stuff
5102  int tga_offset = stbi__get8(s);
5103  int tga_indexed = stbi__get8(s);
5104  int tga_image_type = stbi__get8(s);
5105  int tga_is_RLE = 0;
5106  int tga_palette_start = stbi__get16le(s);
5107  int tga_palette_len = stbi__get16le(s);
5108  int tga_palette_bits = stbi__get8(s);
5109  int tga_x_origin = stbi__get16le(s);
5110  int tga_y_origin = stbi__get16le(s);
5111  int tga_width = stbi__get16le(s);
5112  int tga_height = stbi__get16le(s);
5113  int tga_bits_per_pixel = stbi__get8(s);
5114  int tga_comp, tga_rgb16=0;
5115  int tga_inverted = stbi__get8(s);
5116  // int tga_alpha_bits = tga_inverted & 15; // the 4 lowest bits - unused (useless?)
5117  // image data
5118  unsigned char *tga_data;
5119  unsigned char *tga_palette = NULL;
5120  int i, j;
5121  unsigned char raw_data[4];
5122  int RLE_count = 0;
5123  int RLE_repeating = 0;
5124  int read_next_pixel = 1;
5125 
5126  // do a tiny bit of precessing
5127  if ( tga_image_type >= 8 )
5128  {
5129  tga_image_type -= 8;
5130  tga_is_RLE = 1;
5131  }
5132  tga_inverted = 1 - ((tga_inverted >> 5) & 1);
5133 
5134  // If I'm paletted, then I'll use the number of bits from the palette
5135  if ( tga_indexed ) tga_comp = stbi__tga_get_comp(tga_palette_bits, 0, &tga_rgb16);
5136  else tga_comp = stbi__tga_get_comp(tga_bits_per_pixel, (tga_image_type == 3), &tga_rgb16);
5137 
5138  if(!tga_comp) // shouldn't really happen, stbi__tga_test() should have ensured basic consistency
5139  return stbi__errpuc("bad format", "Can't find out TGA pixelformat");
5140 
5141  // tga info
5142  *x = tga_width;
5143  *y = tga_height;
5144  if (comp) *comp = tga_comp;
5145 
5146  tga_data = (unsigned char*)stbi__malloc( (size_t)tga_width * tga_height * tga_comp );
5147  if (!tga_data) return stbi__errpuc("outofmem", "Out of memory");
5148 
5149  // skip to the data's starting position (offset usually = 0)
5150  stbi__skip(s, tga_offset );
5151 
5152  if ( !tga_indexed && !tga_is_RLE && !tga_rgb16 ) {
5153  for (i=0; i < tga_height; ++i) {
5154  int row = tga_inverted ? tga_height -i - 1 : i;
5155  stbi_uc *tga_row = tga_data + row*tga_width*tga_comp;
5156  stbi__getn(s, tga_row, tga_width * tga_comp);
5157  }
5158  } else {
5159  // do I need to load a palette?
5160  if ( tga_indexed)
5161  {
5162  // any data to skip? (offset usually = 0)
5163  stbi__skip(s, tga_palette_start );
5164  // load the palette
5165  tga_palette = (unsigned char*)stbi__malloc( tga_palette_len * tga_comp );
5166  if (!tga_palette) {
5167  STBI_FREE(tga_data);
5168  return stbi__errpuc("outofmem", "Out of memory");
5169  }
5170  if (tga_rgb16) {
5171  stbi_uc *pal_entry = tga_palette;
5172  STBI_ASSERT(tga_comp == STBI_rgb);
5173  for (i=0; i < tga_palette_len; ++i) {
5174  stbi__tga_read_rgb16(s, pal_entry);
5175  pal_entry += tga_comp;
5176  }
5177  } else if (!stbi__getn(s, tga_palette, tga_palette_len * tga_comp)) {
5178  STBI_FREE(tga_data);
5179  STBI_FREE(tga_palette);
5180  return stbi__errpuc("bad palette", "Corrupt TGA");
5181  }
5182  }
5183  // load the data
5184  for (i=0; i < tga_width * tga_height; ++i)
5185  {
5186  // if I'm in RLE mode, do I need to get a RLE stbi__pngchunk?
5187  if ( tga_is_RLE )
5188  {
5189  if ( RLE_count == 0 )
5190  {
5191  // yep, get the next byte as a RLE command
5192  int RLE_cmd = stbi__get8(s);
5193  RLE_count = 1 + (RLE_cmd & 127);
5194  RLE_repeating = RLE_cmd >> 7;
5195  read_next_pixel = 1;
5196  } else if ( !RLE_repeating )
5197  {
5198  read_next_pixel = 1;
5199  }
5200  } else
5201  {
5202  read_next_pixel = 1;
5203  }
5204  // OK, if I need to read a pixel, do it now
5205  if ( read_next_pixel )
5206  {
5207  // load however much data we did have
5208  if ( tga_indexed )
5209  {
5210  // read in index, then perform the lookup
5211  int pal_idx = (tga_bits_per_pixel == 8) ? stbi__get8(s) : stbi__get16le(s);
5212  if ( pal_idx >= tga_palette_len ) {
5213  // invalid index
5214  pal_idx = 0;
5215  }
5216  pal_idx *= tga_comp;
5217  for (j = 0; j < tga_comp; ++j) {
5218  raw_data[j] = tga_palette[pal_idx+j];
5219  }
5220  } else if(tga_rgb16) {
5221  STBI_ASSERT(tga_comp == STBI_rgb);
5222  stbi__tga_read_rgb16(s, raw_data);
5223  } else {
5224  // read in the data raw
5225  for (j = 0; j < tga_comp; ++j) {
5226  raw_data[j] = stbi__get8(s);
5227  }
5228  }
5229  // clear the reading flag for the next pixel
5230  read_next_pixel = 0;
5231  } // end of reading a pixel
5232 
5233  // copy data
5234  for (j = 0; j < tga_comp; ++j)
5235  tga_data[i*tga_comp+j] = raw_data[j];
5236 
5237  // in case we're in RLE mode, keep counting down
5238  --RLE_count;
5239  }
5240  // do I need to invert the image?
5241  if ( tga_inverted )
5242  {
5243  for (j = 0; j*2 < tga_height; ++j)
5244  {
5245  int index1 = j * tga_width * tga_comp;
5246  int index2 = (tga_height - 1 - j) * tga_width * tga_comp;
5247  for (i = tga_width * tga_comp; i > 0; --i)
5248  {
5249  unsigned char temp = tga_data[index1];
5250  tga_data[index1] = tga_data[index2];
5251  tga_data[index2] = temp;
5252  ++index1;
5253  ++index2;
5254  }
5255  }
5256  }
5257  // clear my palette, if I had one
5258  if ( tga_palette != NULL )
5259  {
5260  STBI_FREE( tga_palette );
5261  }
5262  }
5263 
5264  // swap RGB - if the source data was RGB16, it already is in the right order
5265  if (tga_comp >= 3 && !tga_rgb16)
5266  {
5267  unsigned char* tga_pixel = tga_data;
5268  for (i=0; i < tga_width * tga_height; ++i)
5269  {
5270  unsigned char temp = tga_pixel[0];
5271  tga_pixel[0] = tga_pixel[2];
5272  tga_pixel[2] = temp;
5273  tga_pixel += tga_comp;
5274  }
5275  }
5276 
5277  // convert to target component count
5278  if (req_comp && req_comp != tga_comp)
5279  tga_data = stbi__convert_format(tga_data, tga_comp, req_comp, tga_width, tga_height);
5280 
5281  // the things I do to get rid of an error message, and yet keep
5282  // Microsoft's C compilers happy... [8^(
5283  tga_palette_start = tga_palette_len = tga_palette_bits =
5284  tga_x_origin = tga_y_origin = 0;
5285  // OK, done
5286  return tga_data;
5287 }
5288 #endif
5289 
5290 // *************************************************************************************************
5291 // Photoshop PSD loader -- PD by Thatcher Ulrich, integration by Nicolas Schulz, tweaked by STB
5292 
5293 #ifndef STBI_NO_PSD
5294 static int stbi__psd_test(stbi__context *s)
5295 {
5296  int r = (stbi__get32be(s) == 0x38425053);
5297  stbi__rewind(s);
5298  return r;
5299 }
5300 
5301 static stbi_uc *stbi__psd_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
5302 {
5303  int pixelCount;
5304  int channelCount, compression;
5305  int channel, i, count, len;
5306  int bitdepth;
5307  int w,h;
5308  stbi_uc *out;
5309 
5310  // Check identifier
5311  if (stbi__get32be(s) != 0x38425053) // "8BPS"
5312  return stbi__errpuc("not PSD", "Corrupt PSD image");
5313 
5314  // Check file type version.
5315  if (stbi__get16be(s) != 1)
5316  return stbi__errpuc("wrong version", "Unsupported version of PSD image");
5317 
5318  // Skip 6 reserved bytes.
5319  stbi__skip(s, 6 );
5320 
5321  // Read the number of channels (R, G, B, A, etc).
5322  channelCount = stbi__get16be(s);
5323  if (channelCount < 0 || channelCount > 16)
5324  return stbi__errpuc("wrong channel count", "Unsupported number of channels in PSD image");
5325 
5326  // Read the rows and columns of the image.
5327  h = stbi__get32be(s);
5328  w = stbi__get32be(s);
5329 
5330  // Make sure the depth is 8 bits.
5331  bitdepth = stbi__get16be(s);
5332  if (bitdepth != 8 && bitdepth != 16)
5333  return stbi__errpuc("unsupported bit depth", "PSD bit depth is not 8 or 16 bit");
5334 
5335  // Make sure the color mode is RGB.
5336  // Valid options are:
5337  // 0: Bitmap
5338  // 1: Grayscale
5339  // 2: Indexed color
5340  // 3: RGB color
5341  // 4: CMYK color
5342  // 7: Multichannel
5343  // 8: Duotone
5344  // 9: Lab color
5345  if (stbi__get16be(s) != 3)
5346  return stbi__errpuc("wrong color format", "PSD is not in RGB color format");
5347 
5348  // Skip the Mode Data. (It's the palette for indexed color; other info for other modes.)
5349  stbi__skip(s,stbi__get32be(s) );
5350 
5351  // Skip the image resources. (resolution, pen tool paths, etc)
5352  stbi__skip(s, stbi__get32be(s) );
5353 
5354  // Skip the reserved data.
5355  stbi__skip(s, stbi__get32be(s) );
5356 
5357  // Find out if the data is compressed.
5358  // Known values:
5359  // 0: no compression
5360  // 1: RLE compressed
5361  compression = stbi__get16be(s);
5362  if (compression > 1)
5363  return stbi__errpuc("bad compression", "PSD has an unknown compression format");
5364 
5365  // Create the destination image.
5366  out = (stbi_uc *) stbi__malloc(4 * w*h);
5367  if (!out) return stbi__errpuc("outofmem", "Out of memory");
5368  pixelCount = w*h;
5369 
5370  // Initialize the data to zero.
5371  //memset( out, 0, pixelCount * 4 );
5372 
5373  // Finally, the image data.
5374  if (compression) {
5375  // RLE as used by .PSD and .TIFF
5376  // Loop until you get the number of unpacked bytes you are expecting:
5377  // Read the next source byte into n.
5378  // If n is between 0 and 127 inclusive, copy the next n+1 bytes literally.
5379  // Else if n is between -127 and -1 inclusive, copy the next byte -n+1 times.
5380  // Else if n is 128, noop.
5381  // Endloop
5382 
5383  // The RLE-compressed data is preceeded by a 2-byte data count for each row in the data,
5384  // which we're going to just skip.
5385  stbi__skip(s, h * channelCount * 2 );
5386 
5387  // Read the RLE data by channel.
5388  for (channel = 0; channel < 4; channel++) {
5389  stbi_uc *p;
5390 
5391  p = out+channel;
5392  if (channel >= channelCount) {
5393  // Fill this channel with default data.
5394  for (i = 0; i < pixelCount; i++, p += 4)
5395  *p = (channel == 3 ? 255 : 0);
5396  } else {
5397  // Read the RLE data.
5398  count = 0;
5399  while (count < pixelCount) {
5400  len = stbi__get8(s);
5401  if (len == 128) {
5402  // No-op.
5403  } else if (len < 128) {
5404  // Copy next len+1 bytes literally.
5405  len++;
5406  count += len;
5407  while (len) {
5408  *p = stbi__get8(s);
5409  p += 4;
5410  len--;
5411  }
5412  } else if (len > 128) {
5413  stbi_uc val;
5414  // Next -len+1 bytes in the dest are replicated from next source byte.
5415  // (Interpret len as a negative 8-bit int.)
5416  len ^= 0x0FF;
5417  len += 2;
5418  val = stbi__get8(s);
5419  count += len;
5420  while (len) {
5421  *p = val;
5422  p += 4;
5423  len--;
5424  }
5425  }
5426  }
5427  }
5428  }
5429 
5430  } else {
5431  // We're at the raw image data. It's each channel in order (Red, Green, Blue, Alpha, ...)
5432  // where each channel consists of an 8-bit value for each pixel in the image.
5433 
5434  // Read the data by channel.
5435  for (channel = 0; channel < 4; channel++) {
5436  stbi_uc *p;
5437 
5438  p = out + channel;
5439  if (channel >= channelCount) {
5440  // Fill this channel with default data.
5441  stbi_uc val = channel == 3 ? 255 : 0;
5442  for (i = 0; i < pixelCount; i++, p += 4)
5443  *p = val;
5444  } else {
5445  // Read the data.
5446  if (bitdepth == 16) {
5447  for (i = 0; i < pixelCount; i++, p += 4)
5448  *p = (stbi_uc) (stbi__get16be(s) >> 8);
5449  } else {
5450  for (i = 0; i < pixelCount; i++, p += 4)
5451  *p = stbi__get8(s);
5452  }
5453  }
5454  }
5455  }
5456 
5457  if (channelCount >= 4) {
5458  for (i=0; i < w*h; ++i) {
5459  unsigned char *pixel = out + 4*i;
5460  if (pixel[3] != 0 && pixel[3] != 255) {
5461  // remove weird white matte from PSD
5462  float a = pixel[3] / 255.0f;
5463  float ra = 1.0f / a;
5464  float inv_a = 255.0f * (1 - ra);
5465  pixel[0] = (unsigned char) (pixel[0]*ra + inv_a);
5466  pixel[1] = (unsigned char) (pixel[1]*ra + inv_a);
5467  pixel[2] = (unsigned char) (pixel[2]*ra + inv_a);
5468  }
5469  }
5470  }
5471 
5472  if (req_comp && req_comp != 4) {
5473  out = stbi__convert_format(out, 4, req_comp, w, h);
5474  if (out == NULL) return out; // stbi__convert_format frees input on failure
5475  }
5476 
5477  if (comp) *comp = 4;
5478  *y = h;
5479  *x = w;
5480 
5481  return out;
5482 }
5483 #endif
5484 
5485 // *************************************************************************************************
5486 // Softimage PIC loader
5487 // by Tom Seddon
5488 //
5489 // See http://softimage.wiki.softimage.com/index.php/INFO:_PIC_file_format
5490 // See http://ozviz.wasp.uwa.edu.au/~pbourke/dataformats/softimagepic/
5491 
5492 #ifndef STBI_NO_PIC
5493 static int stbi__pic_is4(stbi__context *s,const char *str)
5494 {
5495  int i;
5496  for (i=0; i<4; ++i)
5497  if (stbi__get8(s) != (stbi_uc)str[i])
5498  return 0;
5499 
5500  return 1;
5501 }
5502 
5503 static int stbi__pic_test_core(stbi__context *s)
5504 {
5505  int i;
5506 
5507  if (!stbi__pic_is4(s,"\x53\x80\xF6\x34"))
5508  return 0;
5509 
5510  for(i=0;i<84;++i)
5511  stbi__get8(s);
5512 
5513  if (!stbi__pic_is4(s,"PICT"))
5514  return 0;
5515 
5516  return 1;
5517 }
5518 
5519 typedef struct
5520 {
5521  stbi_uc size,type,channel;
5522 } stbi__pic_packet;
5523 
5524 static stbi_uc *stbi__readval(stbi__context *s, int channel, stbi_uc *dest)
5525 {
5526  int mask=0x80, i;
5527 
5528  for (i=0; i<4; ++i, mask>>=1) {
5529  if (channel & mask) {
5530  if (stbi__at_eof(s)) return stbi__errpuc("bad file","PIC file too short");
5531  dest[i]=stbi__get8(s);
5532  }
5533  }
5534 
5535  return dest;
5536 }
5537 
5538 static void stbi__copyval(int channel,stbi_uc *dest,const stbi_uc *src)
5539 {
5540  int mask=0x80,i;
5541 
5542  for (i=0;i<4; ++i, mask>>=1)
5543  if (channel&mask)
5544  dest[i]=src[i];
5545 }
5546 
5547 static stbi_uc *stbi__pic_load_core(stbi__context *s,int width,int height,int *comp, stbi_uc *result)
5548 {
5549  int act_comp=0,num_packets=0,y,chained;
5550  stbi__pic_packet packets[10];
5551 
5552  // this will (should...) cater for even some bizarre stuff like having data
5553  // for the same channel in multiple packets.
5554  do {
5555  stbi__pic_packet *packet;
5556 
5557  if (num_packets==sizeof(packets)/sizeof(packets[0]))
5558  return stbi__errpuc("bad format","too many packets");
5559 
5560  packet = &packets[num_packets++];
5561 
5562  chained = stbi__get8(s);
5563  packet->size = stbi__get8(s);
5564  packet->type = stbi__get8(s);
5565  packet->channel = stbi__get8(s);
5566 
5567  act_comp |= packet->channel;
5568 
5569  if (stbi__at_eof(s)) return stbi__errpuc("bad file","file too short (reading packets)");
5570  if (packet->size != 8) return stbi__errpuc("bad format","packet isn't 8bpp");
5571  } while (chained);
5572 
5573  *comp = (act_comp & 0x10 ? 4 : 3); // has alpha channel?
5574 
5575  for(y=0; y<height; ++y) {
5576  int packet_idx;
5577 
5578  for(packet_idx=0; packet_idx < num_packets; ++packet_idx) {
5579  stbi__pic_packet *packet = &packets[packet_idx];
5580  stbi_uc *dest = result+y*width*4;
5581 
5582  switch (packet->type) {
5583  default:
5584  return stbi__errpuc("bad format","packet has bad compression type");
5585 
5586  case 0: {//uncompressed
5587  int x;
5588 
5589  for(x=0;x<width;++x, dest+=4)
5590  if (!stbi__readval(s,packet->channel,dest))
5591  return 0;
5592  break;
5593  }
5594 
5595  case 1://Pure RLE
5596  {
5597  int left=width, i;
5598 
5599  while (left>0) {
5600  stbi_uc count,value[4];
5601 
5602  count=stbi__get8(s);
5603  if (stbi__at_eof(s)) return stbi__errpuc("bad file","file too short (pure read count)");
5604 
5605  if (count > left)
5606  count = (stbi_uc) left;
5607 
5608  if (!stbi__readval(s,packet->channel,value)) return 0;
5609 
5610  for(i=0; i<count; ++i,dest+=4)
5611  stbi__copyval(packet->channel,dest,value);
5612  left -= count;
5613  }
5614  }
5615  break;
5616 
5617  case 2: {//Mixed RLE
5618  int left=width;
5619  while (left>0) {
5620  int count = stbi__get8(s), i;
5621  if (stbi__at_eof(s)) return stbi__errpuc("bad file","file too short (mixed read count)");
5622 
5623  if (count >= 128) { // Repeated
5624  stbi_uc value[4];
5625 
5626  if (count==128)
5627  count = stbi__get16be(s);
5628  else
5629  count -= 127;
5630  if (count > left)
5631  return stbi__errpuc("bad file","scanline overrun");
5632 
5633  if (!stbi__readval(s,packet->channel,value))
5634  return 0;
5635 
5636  for(i=0;i<count;++i, dest += 4)
5637  stbi__copyval(packet->channel,dest,value);
5638  } else { // Raw
5639  ++count;
5640  if (count>left) return stbi__errpuc("bad file","scanline overrun");
5641 
5642  for(i=0;i<count;++i, dest+=4)
5643  if (!stbi__readval(s,packet->channel,dest))
5644  return 0;
5645  }
5646  left-=count;
5647  }
5648  break;
5649  }
5650  }
5651  }
5652  }
5653 
5654  return result;
5655 }
5656 
5657 static stbi_uc *stbi__pic_load(stbi__context *s,int *px,int *py,int *comp,int req_comp)
5658 {
5659  stbi_uc *result;
5660  int i, x,y;
5661 
5662  for (i=0; i<92; ++i)
5663  stbi__get8(s);
5664 
5665  x = stbi__get16be(s);
5666  y = stbi__get16be(s);
5667  if (stbi__at_eof(s)) return stbi__errpuc("bad file","file too short (pic header)");
5668  if ((1 << 28) / x < y) return stbi__errpuc("too large", "Image too large to decode");
5669 
5670  stbi__get32be(s); //skip `ratio'
5671  stbi__get16be(s); //skip `fields'
5672  stbi__get16be(s); //skip `pad'
5673 
5674  // intermediate buffer is RGBA
5675  result = (stbi_uc *) stbi__malloc(x*y*4);
5676  memset(result, 0xff, x*y*4);
5677 
5678  if (!stbi__pic_load_core(s,x,y,comp, result)) {
5679  STBI_FREE(result);
5680  result=0;
5681  }
5682  *px = x;
5683  *py = y;
5684  if (req_comp == 0) req_comp = *comp;
5685  result=stbi__convert_format(result,4,req_comp,x,y);
5686 
5687  return result;
5688 }
5689 
5690 static int stbi__pic_test(stbi__context *s)
5691 {
5692  int r = stbi__pic_test_core(s);
5693  stbi__rewind(s);
5694  return r;
5695 }
5696 #endif
5697 
5698 // *************************************************************************************************
5699 // GIF loader -- public domain by Jean-Marc Lienher -- simplified/shrunk by stb
5700 
5701 #ifndef STBI_NO_GIF
5702 typedef struct
5703 {
5704  stbi__int16 prefix;
5705  stbi_uc first;
5706  stbi_uc suffix;
5707 } stbi__gif_lzw;
5708 
5709 typedef struct
5710 {
5711  int w,h;
5712  stbi_uc *out, *old_out; // output buffer (always 4 components)
5713  int flags, bgindex, ratio, transparent, eflags, delay;
5714  stbi_uc pal[256][4];
5715  stbi_uc lpal[256][4];
5716  stbi__gif_lzw codes[4096];
5717  stbi_uc *color_table;
5718  int parse, step;
5719  int lflags;
5720  int start_x, start_y;
5721  int max_x, max_y;
5722  int cur_x, cur_y;
5723  int line_size;
5724 } stbi__gif;
5725 
5726 static int stbi__gif_test_raw(stbi__context *s)
5727 {
5728  int sz;
5729  if (stbi__get8(s) != 'G' || stbi__get8(s) != 'I' || stbi__get8(s) != 'F' || stbi__get8(s) != '8') return 0;
5730  sz = stbi__get8(s);
5731  if (sz != '9' && sz != '7') return 0;
5732  if (stbi__get8(s) != 'a') return 0;
5733  return 1;
5734 }
5735 
5736 static int stbi__gif_test(stbi__context *s)
5737 {
5738  int r = stbi__gif_test_raw(s);
5739  stbi__rewind(s);
5740  return r;
5741 }
5742 
5743 static void stbi__gif_parse_colortable(stbi__context *s, stbi_uc pal[256][4], int num_entries, int transp)
5744 {
5745  int i;
5746  for (i=0; i < num_entries; ++i) {
5747  pal[i][2] = stbi__get8(s);
5748  pal[i][1] = stbi__get8(s);
5749  pal[i][0] = stbi__get8(s);
5750  pal[i][3] = transp == i ? 0 : 255;
5751  }
5752 }
5753 
5754 static int stbi__gif_header(stbi__context *s, stbi__gif *g, int *comp, int is_info)
5755 {
5756  stbi_uc version;
5757  if (stbi__get8(s) != 'G' || stbi__get8(s) != 'I' || stbi__get8(s) != 'F' || stbi__get8(s) != '8')
5758  return stbi__err("not GIF", "Corrupt GIF");
5759 
5760  version = stbi__get8(s);
5761  if (version != '7' && version != '9') return stbi__err("not GIF", "Corrupt GIF");
5762  if (stbi__get8(s) != 'a') return stbi__err("not GIF", "Corrupt GIF");
5763 
5764  stbi__g_failure_reason = "";
5765  g->w = stbi__get16le(s);
5766  g->h = stbi__get16le(s);
5767  g->flags = stbi__get8(s);
5768  g->bgindex = stbi__get8(s);
5769  g->ratio = stbi__get8(s);
5770  g->transparent = -1;
5771 
5772  if (comp != 0) *comp = 4; // can't actually tell whether it's 3 or 4 until we parse the comments
5773 
5774  if (is_info) return 1;
5775 
5776  if (g->flags & 0x80)
5777  stbi__gif_parse_colortable(s,g->pal, 2 << (g->flags & 7), -1);
5778 
5779  return 1;
5780 }
5781 
5782 static int stbi__gif_info_raw(stbi__context *s, int *x, int *y, int *comp)
5783 {
5784  stbi__gif* g = (stbi__gif*) stbi__malloc(sizeof(stbi__gif));
5785  if (!stbi__gif_header(s, g, comp, 1)) {
5786  STBI_FREE(g);
5787  stbi__rewind( s );
5788  return 0;
5789  }
5790  if (x) *x = g->w;
5791  if (y) *y = g->h;
5792  STBI_FREE(g);
5793  return 1;
5794 }
5795 
5796 static void stbi__out_gif_code(stbi__gif *g, stbi__uint16 code)
5797 {
5798  stbi_uc *p, *c;
5799 
5800  // recurse to decode the prefixes, since the linked-list is backwards,
5801  // and working backwards through an interleaved image would be nasty
5802  if (g->codes[code].prefix >= 0)
5803  stbi__out_gif_code(g, g->codes[code].prefix);
5804 
5805  if (g->cur_y >= g->max_y) return;
5806 
5807  p = &g->out[g->cur_x + g->cur_y];
5808  c = &g->color_table[g->codes[code].suffix * 4];
5809 
5810  if (c[3] >= 128) {
5811  p[0] = c[2];
5812  p[1] = c[1];
5813  p[2] = c[0];
5814  p[3] = c[3];
5815  }
5816  g->cur_x += 4;
5817 
5818  if (g->cur_x >= g->max_x) {
5819  g->cur_x = g->start_x;
5820  g->cur_y += g->step;
5821 
5822  while (g->cur_y >= g->max_y && g->parse > 0) {
5823  g->step = (1 << g->parse) * g->line_size;
5824  g->cur_y = g->start_y + (g->step >> 1);
5825  --g->parse;
5826  }
5827  }
5828 }
5829 
5830 static stbi_uc *stbi__process_gif_raster(stbi__context *s, stbi__gif *g)
5831 {
5832  stbi_uc lzw_cs;
5833  stbi__int32 len, init_code;
5834  stbi__uint32 first;
5835  stbi__int32 codesize, codemask, avail, oldcode, bits, valid_bits, clear;
5836  stbi__gif_lzw *p;
5837 
5838  lzw_cs = stbi__get8(s);
5839  if (lzw_cs > 12) return NULL;
5840  clear = 1 << lzw_cs;
5841  first = 1;
5842  codesize = lzw_cs + 1;
5843  codemask = (1 << codesize) - 1;
5844  bits = 0;
5845  valid_bits = 0;
5846  for (init_code = 0; init_code < clear; init_code++) {
5847  g->codes[init_code].prefix = -1;
5848  g->codes[init_code].first = (stbi_uc) init_code;
5849  g->codes[init_code].suffix = (stbi_uc) init_code;
5850  }
5851 
5852  // support no starting clear code
5853  avail = clear+2;
5854  oldcode = -1;
5855 
5856  len = 0;
5857  for(;;) {
5858  if (valid_bits < codesize) {
5859  if (len == 0) {
5860  len = stbi__get8(s); // start new block
5861  if (len == 0)
5862  return g->out;
5863  }
5864  --len;
5865  bits |= (stbi__int32) stbi__get8(s) << valid_bits;
5866  valid_bits += 8;
5867  } else {
5868  stbi__int32 code = bits & codemask;
5869  bits >>= codesize;
5870  valid_bits -= codesize;
5871  // @OPTIMIZE: is there some way we can accelerate the non-clear path?
5872  if (code == clear) { // clear code
5873  codesize = lzw_cs + 1;
5874  codemask = (1 << codesize) - 1;
5875  avail = clear + 2;
5876  oldcode = -1;
5877  first = 0;
5878  } else if (code == clear + 1) { // end of stream code
5879  stbi__skip(s, len);
5880  while ((len = stbi__get8(s)) > 0)
5881  stbi__skip(s,len);
5882  return g->out;
5883  } else if (code <= avail) {
5884  if (first) return stbi__errpuc("no clear code", "Corrupt GIF");
5885 
5886  if (oldcode >= 0) {
5887  p = &g->codes[avail++];
5888  if (avail > 4096) return stbi__errpuc("too many codes", "Corrupt GIF");
5889  p->prefix = (stbi__int16) oldcode;
5890  p->first = g->codes[oldcode].first;
5891  p->suffix = (code == avail) ? p->first : g->codes[code].first;
5892  } else if (code == avail)
5893  return stbi__errpuc("illegal code in raster", "Corrupt GIF");
5894 
5895  stbi__out_gif_code(g, (stbi__uint16) code);
5896 
5897  if ((avail & codemask) == 0 && avail <= 0x0FFF) {
5898  codesize++;
5899  codemask = (1 << codesize) - 1;
5900  }
5901 
5902  oldcode = code;
5903  } else {
5904  return stbi__errpuc("illegal code in raster", "Corrupt GIF");
5905  }
5906  }
5907  }
5908 }
5909 
5910 static void stbi__fill_gif_background(stbi__gif *g, int x0, int y0, int x1, int y1)
5911 {
5912  int x, y;
5913  stbi_uc *c = g->pal[g->bgindex];
5914  for (y = y0; y < y1; y += 4 * g->w) {
5915  for (x = x0; x < x1; x += 4) {
5916  stbi_uc *p = &g->out[y + x];
5917  p[0] = c[2];
5918  p[1] = c[1];
5919  p[2] = c[0];
5920  p[3] = 0;
5921  }
5922  }
5923 }
5924 
5925 // this function is designed to support animated gifs, although stb_image doesn't support it
5926 static stbi_uc *stbi__gif_load_next(stbi__context *s, stbi__gif *g, int *comp, int req_comp)
5927 {
5928  int i;
5929  stbi_uc *prev_out = 0;
5930 
5931  if (g->out == 0 && !stbi__gif_header(s, g, comp,0))
5932  return 0; // stbi__g_failure_reason set by stbi__gif_header
5933 
5934  prev_out = g->out;
5935  g->out = (stbi_uc *) stbi__malloc(4 * g->w * g->h);
5936  if (g->out == 0) return stbi__errpuc("outofmem", "Out of memory");
5937 
5938  switch ((g->eflags & 0x1C) >> 2) {
5939  case 0: // unspecified (also always used on 1st frame)
5940  stbi__fill_gif_background(g, 0, 0, 4 * g->w, 4 * g->w * g->h);
5941  break;
5942  case 1: // do not dispose
5943  if (prev_out) memcpy(g->out, prev_out, 4 * g->w * g->h);
5944  g->old_out = prev_out;
5945  break;
5946  case 2: // dispose to background
5947  if (prev_out) memcpy(g->out, prev_out, 4 * g->w * g->h);
5948  stbi__fill_gif_background(g, g->start_x, g->start_y, g->max_x, g->max_y);
5949  break;
5950  case 3: // dispose to previous
5951  if (g->old_out) {
5952  for (i = g->start_y; i < g->max_y; i += 4 * g->w)
5953  memcpy(&g->out[i + g->start_x], &g->old_out[i + g->start_x], g->max_x - g->start_x);
5954  }
5955  break;
5956  }
5957 
5958  for (;;) {
5959  switch (stbi__get8(s)) {
5960  case 0x2C: /* Image Descriptor */
5961  {
5962  int prev_trans = -1;
5963  stbi__int32 x, y, w, h;
5964  stbi_uc *o;
5965 
5966  x = stbi__get16le(s);
5967  y = stbi__get16le(s);
5968  w = stbi__get16le(s);
5969  h = stbi__get16le(s);
5970  if (((x + w) > (g->w)) || ((y + h) > (g->h)))
5971  return stbi__errpuc("bad Image Descriptor", "Corrupt GIF");
5972 
5973  g->line_size = g->w * 4;
5974  g->start_x = x * 4;
5975  g->start_y = y * g->line_size;
5976  g->max_x = g->start_x + w * 4;
5977  g->max_y = g->start_y + h * g->line_size;
5978  g->cur_x = g->start_x;
5979  g->cur_y = g->start_y;
5980 
5981  g->lflags = stbi__get8(s);
5982 
5983  if (g->lflags & 0x40) {
5984  g->step = 8 * g->line_size; // first interlaced spacing
5985  g->parse = 3;
5986  } else {
5987  g->step = g->line_size;
5988  g->parse = 0;
5989  }
5990 
5991  if (g->lflags & 0x80) {
5992  stbi__gif_parse_colortable(s,g->lpal, 2 << (g->lflags & 7), g->eflags & 0x01 ? g->transparent : -1);
5993  g->color_table = (stbi_uc *) g->lpal;
5994  } else if (g->flags & 0x80) {
5995  if (g->transparent >= 0 && (g->eflags & 0x01)) {
5996  prev_trans = g->pal[g->transparent][3];
5997  g->pal[g->transparent][3] = 0;
5998  }
5999  g->color_table = (stbi_uc *) g->pal;
6000  } else
6001  return stbi__errpuc("missing color table", "Corrupt GIF");
6002 
6003  o = stbi__process_gif_raster(s, g);
6004  if (o == NULL) return NULL;
6005 
6006  if (prev_trans != -1)
6007  g->pal[g->transparent][3] = (stbi_uc) prev_trans;
6008 
6009  return o;
6010  }
6011 
6012  case 0x21: // Comment Extension.
6013  {
6014  int len;
6015  if (stbi__get8(s) == 0xF9) { // Graphic Control Extension.
6016  len = stbi__get8(s);
6017  if (len == 4) {
6018  g->eflags = stbi__get8(s);
6019  g->delay = stbi__get16le(s);
6020  g->transparent = stbi__get8(s);
6021  } else {
6022  stbi__skip(s, len);
6023  break;
6024  }
6025  }
6026  while ((len = stbi__get8(s)) != 0)
6027  stbi__skip(s, len);
6028  break;
6029  }
6030 
6031  case 0x3B: // gif stream termination code
6032  return (stbi_uc *) s; // using '1' causes warning on some compilers
6033 
6034  default:
6035  return stbi__errpuc("unknown code", "Corrupt GIF");
6036  }
6037  }
6038 
6039  STBI_NOTUSED(req_comp);
6040 }
6041 
6042 static stbi_uc *stbi__gif_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
6043 {
6044  stbi_uc *u = 0;
6045  stbi__gif* g = (stbi__gif*) stbi__malloc(sizeof(stbi__gif));
6046  memset(g, 0, sizeof(*g));
6047 
6048  u = stbi__gif_load_next(s, g, comp, req_comp);
6049  if (u == (stbi_uc *) s) u = 0; // end of animated gif marker
6050  if (u) {
6051  *x = g->w;
6052  *y = g->h;
6053  if (req_comp && req_comp != 4)
6054  u = stbi__convert_format(u, 4, req_comp, g->w, g->h);
6055  }
6056  else if (g->out)
6057  STBI_FREE(g->out);
6058  STBI_FREE(g);
6059  return u;
6060 }
6061 
6062 static int stbi__gif_info(stbi__context *s, int *x, int *y, int *comp)
6063 {
6064  return stbi__gif_info_raw(s,x,y,comp);
6065 }
6066 #endif
6067 
6068 // *************************************************************************************************
6069 // Radiance RGBE HDR loader
6070 // originally by Nicolas Schulz
6071 #ifndef STBI_NO_HDR
6072 static int stbi__hdr_test_core(stbi__context *s)
6073 {
6074  const char *signature = "#?RADIANCE\n";
6075  int i;
6076  for (i=0; signature[i]; ++i)
6077  if (stbi__get8(s) != signature[i])
6078  return 0;
6079  return 1;
6080 }
6081 
6082 static int stbi__hdr_test(stbi__context* s)
6083 {
6084  int r = stbi__hdr_test_core(s);
6085  stbi__rewind(s);
6086  return r;
6087 }
6088 
6089 #define STBI__HDR_BUFLEN 1024
6090 static char *stbi__hdr_gettoken(stbi__context *z, char *buffer)
6091 {
6092  int len=0;
6093  char c = '\0';
6094 
6095  c = (char) stbi__get8(z);
6096 
6097  while (!stbi__at_eof(z) && c != '\n') {
6098  buffer[len++] = c;
6099  if (len == STBI__HDR_BUFLEN-1) {
6100  // flush to end of line
6101  while (!stbi__at_eof(z) && stbi__get8(z) != '\n')
6102  ;
6103  break;
6104  }
6105  c = (char) stbi__get8(z);
6106  }
6107 
6108  buffer[len] = 0;
6109  return buffer;
6110 }
6111 
6112 static void stbi__hdr_convert(float *output, stbi_uc *input, int req_comp)
6113 {
6114  if ( input[3] != 0 ) {
6115  float f1;
6116  // Exponent
6117  f1 = (float) ldexp(1.0f, input[3] - (int)(128 + 8));
6118  if (req_comp <= 2)
6119  output[0] = (input[0] + input[1] + input[2]) * f1 / 3;
6120  else {
6121  output[0] = input[0] * f1;
6122  output[1] = input[1] * f1;
6123  output[2] = input[2] * f1;
6124  }
6125  if (req_comp == 2) output[1] = 1;
6126  if (req_comp == 4) output[3] = 1;
6127  } else {
6128  switch (req_comp) {
6129  case 4: output[3] = 1; /* fallthrough */
6130  case 3: output[0] = output[1] = output[2] = 0;
6131  break;
6132  case 2: output[1] = 1; /* fallthrough */
6133  case 1: output[0] = 0;
6134  break;
6135  }
6136  }
6137 }
6138 
6139 static float *stbi__hdr_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
6140 {
6141  char buffer[STBI__HDR_BUFLEN];
6142  char *token;
6143  int valid = 0;
6144  int width, height;
6145  stbi_uc *scanline;
6146  float *hdr_data;
6147  int len;
6148  unsigned char count, value;
6149  int i, j, k, c1,c2, z;
6150 
6151 
6152  // Check identifier
6153  if (strcmp(stbi__hdr_gettoken(s,buffer), "#?RADIANCE") != 0)
6154  return stbi__errpf("not HDR", "Corrupt HDR image");
6155 
6156  // Parse header
6157  for(;;) {
6158  token = stbi__hdr_gettoken(s,buffer);
6159  if (token[0] == 0) break;
6160  if (strcmp(token, "FORMAT=32-bit_rle_rgbe") == 0) valid = 1;
6161  }
6162 
6163  if (!valid) return stbi__errpf("unsupported format", "Unsupported HDR format");
6164 
6165  // Parse width and height
6166  // can't use sscanf() if we're not using stdio!
6167  token = stbi__hdr_gettoken(s,buffer);
6168  if (strncmp(token, "-Y ", 3)) return stbi__errpf("unsupported data layout", "Unsupported HDR format");
6169  token += 3;
6170  height = (int) strtol(token, &token, 10);
6171  while (*token == ' ') ++token;
6172  if (strncmp(token, "+X ", 3)) return stbi__errpf("unsupported data layout", "Unsupported HDR format");
6173  token += 3;
6174  width = (int) strtol(token, NULL, 10);
6175 
6176  *x = width;
6177  *y = height;
6178 
6179  if (comp) *comp = 3;
6180  if (req_comp == 0) req_comp = 3;
6181 
6182  // Read data
6183  hdr_data = (float *) stbi__malloc(height * width * req_comp * sizeof(float));
6184 
6185  // Load image data
6186  // image data is stored as some number of sca
6187  if ( width < 8 || width >= 32768) {
6188  // Read flat data
6189  for (j=0; j < height; ++j) {
6190  for (i=0; i < width; ++i) {
6191  stbi_uc rgbe[4];
6192  main_decode_loop:
6193  stbi__getn(s, rgbe, 4);
6194  stbi__hdr_convert(hdr_data + j * width * req_comp + i * req_comp, rgbe, req_comp);
6195  }
6196  }
6197  } else {
6198  // Read RLE-encoded data
6199  scanline = NULL;
6200 
6201  for (j = 0; j < height; ++j) {
6202  c1 = stbi__get8(s);
6203  c2 = stbi__get8(s);
6204  len = stbi__get8(s);
6205  if (c1 != 2 || c2 != 2 || (len & 0x80)) {
6206  // not run-length encoded, so we have to actually use THIS data as a decoded
6207  // pixel (note this can't be a valid pixel--one of RGB must be >= 128)
6208  stbi_uc rgbe[4];
6209  rgbe[0] = (stbi_uc) c1;
6210  rgbe[1] = (stbi_uc) c2;
6211  rgbe[2] = (stbi_uc) len;
6212  rgbe[3] = (stbi_uc) stbi__get8(s);
6213  stbi__hdr_convert(hdr_data, rgbe, req_comp);
6214  i = 1;
6215  j = 0;
6216  STBI_FREE(scanline);
6217  goto main_decode_loop; // yes, this makes no sense
6218  }
6219  len <<= 8;
6220  len |= stbi__get8(s);
6221  if (len != width) { STBI_FREE(hdr_data); STBI_FREE(scanline); return stbi__errpf("invalid decoded scanline length", "corrupt HDR"); }
6222  if (scanline == NULL) scanline = (stbi_uc *) stbi__malloc(width * 4);
6223 
6224  for (k = 0; k < 4; ++k) {
6225  i = 0;
6226  while (i < width) {
6227  count = stbi__get8(s);
6228  if (count > 128) {
6229  // Run
6230  value = stbi__get8(s);
6231  count -= 128;
6232  for (z = 0; z < count; ++z)
6233  scanline[i++ * 4 + k] = value;
6234  } else {
6235  // Dump
6236  for (z = 0; z < count; ++z)
6237  scanline[i++ * 4 + k] = stbi__get8(s);
6238  }
6239  }
6240  }
6241  for (i=0; i < width; ++i)
6242  stbi__hdr_convert(hdr_data+(j*width + i)*req_comp, scanline + i*4, req_comp);
6243  }
6244  STBI_FREE(scanline);
6245  }
6246 
6247  return hdr_data;
6248 }
6249 
6250 static int stbi__hdr_info(stbi__context *s, int *x, int *y, int *comp)
6251 {
6252  char buffer[STBI__HDR_BUFLEN];
6253  char *token;
6254  int valid = 0;
6255 
6256  if (stbi__hdr_test(s) == 0) {
6257  stbi__rewind( s );
6258  return 0;
6259  }
6260 
6261  for(;;) {
6262  token = stbi__hdr_gettoken(s,buffer);
6263  if (token[0] == 0) break;
6264  if (strcmp(token, "FORMAT=32-bit_rle_rgbe") == 0) valid = 1;
6265  }
6266 
6267  if (!valid) {
6268  stbi__rewind( s );
6269  return 0;
6270  }
6271  token = stbi__hdr_gettoken(s,buffer);
6272  if (strncmp(token, "-Y ", 3)) {
6273  stbi__rewind( s );
6274  return 0;
6275  }
6276  token += 3;
6277  *y = (int) strtol(token, &token, 10);
6278  while (*token == ' ') ++token;
6279  if (strncmp(token, "+X ", 3)) {
6280  stbi__rewind( s );
6281  return 0;
6282  }
6283  token += 3;
6284  *x = (int) strtol(token, NULL, 10);
6285  *comp = 3;
6286  return 1;
6287 }
6288 #endif // STBI_NO_HDR
6289 
6290 #ifndef STBI_NO_BMP
6291 static int stbi__bmp_info(stbi__context *s, int *x, int *y, int *comp)
6292 {
6293  void *p;
6294  stbi__bmp_data info;
6295 
6296  info.all_a = 255;
6297  p = stbi__bmp_parse_header(s, &info);
6298  stbi__rewind( s );
6299  if (p == NULL)
6300  return 0;
6301  *x = s->img_x;
6302  *y = s->img_y;
6303  *comp = info.ma ? 4 : 3;
6304  return 1;
6305 }
6306 #endif
6307 
6308 #ifndef STBI_NO_PSD
6309 static int stbi__psd_info(stbi__context *s, int *x, int *y, int *comp)
6310 {
6311  int channelCount;
6312  if (stbi__get32be(s) != 0x38425053) {
6313  stbi__rewind( s );
6314  return 0;
6315  }
6316  if (stbi__get16be(s) != 1) {
6317  stbi__rewind( s );
6318  return 0;
6319  }
6320  stbi__skip(s, 6);
6321  channelCount = stbi__get16be(s);
6322  if (channelCount < 0 || channelCount > 16) {
6323  stbi__rewind( s );
6324  return 0;
6325  }
6326  *y = stbi__get32be(s);
6327  *x = stbi__get32be(s);
6328  if (stbi__get16be(s) != 8) {
6329  stbi__rewind( s );
6330  return 0;
6331  }
6332  if (stbi__get16be(s) != 3) {
6333  stbi__rewind( s );
6334  return 0;
6335  }
6336  *comp = 4;
6337  return 1;
6338 }
6339 #endif
6340 
6341 #ifndef STBI_NO_PIC
6342 static int stbi__pic_info(stbi__context *s, int *x, int *y, int *comp)
6343 {
6344  int act_comp=0,num_packets=0,chained;
6345  stbi__pic_packet packets[10];
6346 
6347  if (!stbi__pic_is4(s,"\x53\x80\xF6\x34")) {
6348  stbi__rewind(s);
6349  return 0;
6350  }
6351 
6352  stbi__skip(s, 88);
6353 
6354  *x = stbi__get16be(s);
6355  *y = stbi__get16be(s);
6356  if (stbi__at_eof(s)) {
6357  stbi__rewind( s);
6358  return 0;
6359  }
6360  if ( (*x) != 0 && (1 << 28) / (*x) < (*y)) {
6361  stbi__rewind( s );
6362  return 0;
6363  }
6364 
6365  stbi__skip(s, 8);
6366 
6367  do {
6368  stbi__pic_packet *packet;
6369 
6370  if (num_packets==sizeof(packets)/sizeof(packets[0]))
6371  return 0;
6372 
6373  packet = &packets[num_packets++];
6374  chained = stbi__get8(s);
6375  packet->size = stbi__get8(s);
6376  packet->type = stbi__get8(s);
6377  packet->channel = stbi__get8(s);
6378  act_comp |= packet->channel;
6379 
6380  if (stbi__at_eof(s)) {
6381  stbi__rewind( s );
6382  return 0;
6383  }
6384  if (packet->size != 8) {
6385  stbi__rewind( s );
6386  return 0;
6387  }
6388  } while (chained);
6389 
6390  *comp = (act_comp & 0x10 ? 4 : 3);
6391 
6392  return 1;
6393 }
6394 #endif
6395 
6396 // *************************************************************************************************
6397 // Portable Gray Map and Portable Pixel Map loader
6398 // by Ken Miller
6399 //
6400 // PGM: http://netpbm.sourceforge.net/doc/pgm.html
6401 // PPM: http://netpbm.sourceforge.net/doc/ppm.html
6402 //
6403 // Known limitations:
6404 // Does not support comments in the header section
6405 // Does not support ASCII image data (formats P2 and P3)
6406 // Does not support 16-bit-per-channel
6407 
6408 #ifndef STBI_NO_PNM
6409 
6410 static int stbi__pnm_test(stbi__context *s)
6411 {
6412  char p, t;
6413  p = (char) stbi__get8(s);
6414  t = (char) stbi__get8(s);
6415  if (p != 'P' || (t != '5' && t != '6')) {
6416  stbi__rewind( s );
6417  return 0;
6418  }
6419  return 1;
6420 }
6421 
6422 static stbi_uc *stbi__pnm_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
6423 {
6424  stbi_uc *out;
6425  if (!stbi__pnm_info(s, (int *)&s->img_x, (int *)&s->img_y, (int *)&s->img_n))
6426  return 0;
6427  *x = s->img_x;
6428  *y = s->img_y;
6429  *comp = s->img_n;
6430 
6431  out = (stbi_uc *) stbi__malloc(s->img_n * s->img_x * s->img_y);
6432  if (!out) return stbi__errpuc("outofmem", "Out of memory");
6433  stbi__getn(s, out, s->img_n * s->img_x * s->img_y);
6434 
6435  if (req_comp && req_comp != s->img_n) {
6436  out = stbi__convert_format(out, s->img_n, req_comp, s->img_x, s->img_y);
6437  if (out == NULL) return out; // stbi__convert_format frees input on failure
6438  }
6439  return out;
6440 }
6441 
6442 static int stbi__pnm_isspace(char c)
6443 {
6444  return c == ' ' || c == '\t' || c == '\n' || c == '\v' || c == '\f' || c == '\r';
6445 }
6446 
6447 static void stbi__pnm_skip_whitespace(stbi__context *s, char *c)
6448 {
6449  for (;;) {
6450  while (!stbi__at_eof(s) && stbi__pnm_isspace(*c))
6451  *c = (char) stbi__get8(s);
6452 
6453  if (stbi__at_eof(s) || *c != '#')
6454  break;
6455 
6456  while (!stbi__at_eof(s) && *c != '\n' && *c != '\r' )
6457  *c = (char) stbi__get8(s);
6458  }
6459 }
6460 
6461 static int stbi__pnm_isdigit(char c)
6462 {
6463  return c >= '0' && c <= '9';
6464 }
6465 
6466 static int stbi__pnm_getinteger(stbi__context *s, char *c)
6467 {
6468  int value = 0;
6469 
6470  while (!stbi__at_eof(s) && stbi__pnm_isdigit(*c)) {
6471  value = value*10 + (*c - '0');
6472  *c = (char) stbi__get8(s);
6473  }
6474 
6475  return value;
6476 }
6477 
6478 static int stbi__pnm_info(stbi__context *s, int *x, int *y, int *comp)
6479 {
6480  int maxv;
6481  char c, p, t;
6482 
6483  stbi__rewind( s );
6484 
6485  // Get identifier
6486  p = (char) stbi__get8(s);
6487  t = (char) stbi__get8(s);
6488  if (p != 'P' || (t != '5' && t != '6')) {
6489  stbi__rewind( s );
6490  return 0;
6491  }
6492 
6493  *comp = (t == '6') ? 3 : 1; // '5' is 1-component .pgm; '6' is 3-component .ppm
6494 
6495  c = (char) stbi__get8(s);
6496  stbi__pnm_skip_whitespace(s, &c);
6497 
6498  *x = stbi__pnm_getinteger(s, &c); // read width
6499  stbi__pnm_skip_whitespace(s, &c);
6500 
6501  *y = stbi__pnm_getinteger(s, &c); // read height
6502  stbi__pnm_skip_whitespace(s, &c);
6503 
6504  maxv = stbi__pnm_getinteger(s, &c); // read max value
6505 
6506  if (maxv > 255)
6507  return stbi__err("max value > 255", "PPM image not 8-bit");
6508  else
6509  return 1;
6510 }
6511 #endif
6512 
6513 static int stbi__info_main(stbi__context *s, int *x, int *y, int *comp)
6514 {
6515  #ifndef STBI_NO_JPEG
6516  if (stbi__jpeg_info(s, x, y, comp)) return 1;
6517  #endif
6518 
6519  #ifndef STBI_NO_PNG
6520  if (stbi__png_info(s, x, y, comp)) return 1;
6521  #endif
6522 
6523  #ifndef STBI_NO_GIF
6524  if (stbi__gif_info(s, x, y, comp)) return 1;
6525  #endif
6526 
6527  #ifndef STBI_NO_BMP
6528  if (stbi__bmp_info(s, x, y, comp)) return 1;
6529  #endif
6530 
6531  #ifndef STBI_NO_PSD
6532  if (stbi__psd_info(s, x, y, comp)) return 1;
6533  #endif
6534 
6535  #ifndef STBI_NO_PIC
6536  if (stbi__pic_info(s, x, y, comp)) return 1;
6537  #endif
6538 
6539  #ifndef STBI_NO_PNM
6540  if (stbi__pnm_info(s, x, y, comp)) return 1;
6541  #endif
6542 
6543  #ifndef STBI_NO_HDR
6544  if (stbi__hdr_info(s, x, y, comp)) return 1;
6545  #endif
6546 
6547  // test tga last because it's a crappy test!
6548  #ifndef STBI_NO_TGA
6549  if (stbi__tga_info(s, x, y, comp))
6550  return 1;
6551  #endif
6552  return stbi__err("unknown image type", "Image not of any known type, or corrupt");
6553 }
6554 
6555 #ifndef STBI_NO_STDIO
6556 STBIDEF int stbi_info(char const *filename, int *x, int *y, int *comp)
6557 {
6558  FILE *f = stbi__fopen(filename, "rb");
6559  int result;
6560  if (!f) return stbi__err("can't fopen", "Unable to open file");
6561  result = stbi_info_from_file(f, x, y, comp);
6562  fclose(f);
6563  return result;
6564 }
6565 
6566 STBIDEF int stbi_info_from_file(FILE *f, int *x, int *y, int *comp)
6567 {
6568  int r;
6569  stbi__context s;
6570  long pos = ftell(f);
6571  stbi__start_file(&s, f);
6572  r = stbi__info_main(&s,x,y,comp);
6573  fseek(f,pos,SEEK_SET);
6574  return r;
6575 }
6576 #endif // !STBI_NO_STDIO
6577 
6578 STBIDEF int stbi_info_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp)
6579 {
6580  stbi__context s;
6581  stbi__start_mem(&s,buffer,len);
6582  return stbi__info_main(&s,x,y,comp);
6583 }
6584 
6585 STBIDEF int stbi_info_from_callbacks(stbi_io_callbacks const *c, void *user, int *x, int *y, int *comp)
6586 {
6587  stbi__context s;
6588  stbi__start_callbacks(&s, (stbi_io_callbacks *) c, user);
6589  return stbi__info_main(&s,x,y,comp);
6590 }
6591 
6592 #endif // STB_IMAGE_IMPLEMENTATION
6593 
6594 /*
6595  revision history:
6596  2.12 (2016-04-02) fix typo in 2.11 PSD fix that caused crashes
6597  2.11 (2016-04-02) allocate large structures on the stack
6598  remove white matting for transparent PSD
6599  fix reported channel count for PNG & BMP
6600  re-enable SSE2 in non-gcc 64-bit
6601  support RGB-formatted JPEG
6602  read 16-bit PNGs (only as 8-bit)
6603  2.10 (2016-01-22) avoid warning introduced in 2.09 by STBI_REALLOC_SIZED
6604  2.09 (2016-01-16) allow comments in PNM files
6605  16-bit-per-pixel TGA (not bit-per-component)
6606  info() for TGA could break due to .hdr handling
6607  info() for BMP to shares code instead of sloppy parse
6608  can use STBI_REALLOC_SIZED if allocator doesn't support realloc
6609  code cleanup
6610  2.08 (2015-09-13) fix to 2.07 cleanup, reading RGB PSD as RGBA
6611  2.07 (2015-09-13) fix compiler warnings
6612  partial animated GIF support
6613  limited 16-bpc PSD support
6614  #ifdef unused functions
6615  bug with < 92 byte PIC,PNM,HDR,TGA
6616  2.06 (2015-04-19) fix bug where PSD returns wrong '*comp' value
6617  2.05 (2015-04-19) fix bug in progressive JPEG handling, fix warning
6618  2.04 (2015-04-15) try to re-enable SIMD on MinGW 64-bit
6619  2.03 (2015-04-12) extra corruption checking (mmozeiko)
6620  stbi_set_flip_vertically_on_load (nguillemot)
6621  fix NEON support; fix mingw support
6622  2.02 (2015-01-19) fix incorrect assert, fix warning
6623  2.01 (2015-01-17) fix various warnings; suppress SIMD on gcc 32-bit without -msse2
6624  2.00b (2014-12-25) fix STBI_MALLOC in progressive JPEG
6625  2.00 (2014-12-25) optimize JPG, including x86 SSE2 & NEON SIMD (ryg)
6626  progressive JPEG (stb)
6627  PGM/PPM support (Ken Miller)
6628  STBI_MALLOC,STBI_REALLOC,STBI_FREE
6629  GIF bugfix -- seemingly never worked
6630  STBI_NO_*, STBI_ONLY_*
6631  1.48 (2014-12-14) fix incorrectly-named assert()
6632  1.47 (2014-12-14) 1/2/4-bit PNG support, both direct and paletted (Omar Cornut & stb)
6633  optimize PNG (ryg)
6634  fix bug in interlaced PNG with user-specified channel count (stb)
6635  1.46 (2014-08-26)
6636  fix broken tRNS chunk (colorkey-style transparency) in non-paletted PNG
6637  1.45 (2014-08-16)
6638  fix MSVC-ARM internal compiler error by wrapping malloc
6639  1.44 (2014-08-07)
6640  various warning fixes from Ronny Chevalier
6641  1.43 (2014-07-15)
6642  fix MSVC-only compiler problem in code changed in 1.42
6643  1.42 (2014-07-09)
6644  don't define _CRT_SECURE_NO_WARNINGS (affects user code)
6645  fixes to stbi__cleanup_jpeg path
6646  added STBI_ASSERT to avoid requiring assert.h
6647  1.41 (2014-06-25)
6648  fix search&replace from 1.36 that messed up comments/error messages
6649  1.40 (2014-06-22)
6650  fix gcc struct-initialization warning
6651  1.39 (2014-06-15)
6652  fix to TGA optimization when req_comp != number of components in TGA;
6653  fix to GIF loading because BMP wasn't rewinding (whoops, no GIFs in my test suite)
6654  add support for BMP version 5 (more ignored fields)
6655  1.38 (2014-06-06)
6656  suppress MSVC warnings on integer casts truncating values
6657  fix accidental rename of 'skip' field of I/O
6658  1.37 (2014-06-04)
6659  remove duplicate typedef
6660  1.36 (2014-06-03)
6661  convert to header file single-file library
6662  if de-iphone isn't set, load iphone images color-swapped instead of returning NULL
6663  1.35 (2014-05-27)
6664  various warnings
6665  fix broken STBI_SIMD path
6666  fix bug where stbi_load_from_file no longer left file pointer in correct place
6667  fix broken non-easy path for 32-bit BMP (possibly never used)
6668  TGA optimization by Arseny Kapoulkine
6669  1.34 (unknown)
6670  use STBI_NOTUSED in stbi__resample_row_generic(), fix one more leak in tga failure case
6671  1.33 (2011-07-14)
6672  make stbi_is_hdr work in STBI_NO_HDR (as specified), minor compiler-friendly improvements
6673  1.32 (2011-07-13)
6674  support for "info" function for all supported filetypes (SpartanJ)
6675  1.31 (2011-06-20)
6676  a few more leak fixes, bug in PNG handling (SpartanJ)
6677  1.30 (2011-06-11)
6678  added ability to load files via callbacks to accomidate custom input streams (Ben Wenger)
6679  removed deprecated format-specific test/load functions
6680  removed support for installable file formats (stbi_loader) -- would have been broken for IO callbacks anyway
6681  error cases in bmp and tga give messages and don't leak (Raymond Barbiero, grisha)
6682  fix inefficiency in decoding 32-bit BMP (David Woo)
6683  1.29 (2010-08-16)
6684  various warning fixes from Aurelien Pocheville
6685  1.28 (2010-08-01)
6686  fix bug in GIF palette transparency (SpartanJ)
6687  1.27 (2010-08-01)
6688  cast-to-stbi_uc to fix warnings
6689  1.26 (2010-07-24)
6690  fix bug in file buffering for PNG reported by SpartanJ
6691  1.25 (2010-07-17)
6692  refix trans_data warning (Won Chun)
6693  1.24 (2010-07-12)
6694  perf improvements reading from files on platforms with lock-heavy fgetc()
6695  minor perf improvements for jpeg
6696  deprecated type-specific functions so we'll get feedback if they're needed
6697  attempt to fix trans_data warning (Won Chun)
6698  1.23 fixed bug in iPhone support
6699  1.22 (2010-07-10)
6700  removed image *writing* support
6701  stbi_info support from Jetro Lauha
6702  GIF support from Jean-Marc Lienher
6703  iPhone PNG-extensions from James Brown
6704  warning-fixes from Nicolas Schulz and Janez Zemva (i.stbi__err. Janez (U+017D)emva)
6705  1.21 fix use of 'stbi_uc' in header (reported by jon blow)
6706  1.20 added support for Softimage PIC, by Tom Seddon
6707  1.19 bug in interlaced PNG corruption check (found by ryg)
6708  1.18 (2008-08-02)
6709  fix a threading bug (local mutable static)
6710  1.17 support interlaced PNG
6711  1.16 major bugfix - stbi__convert_format converted one too many pixels
6712  1.15 initialize some fields for thread safety
6713  1.14 fix threadsafe conversion bug
6714  header-file-only version (#define STBI_HEADER_FILE_ONLY before including)
6715  1.13 threadsafe
6716  1.12 const qualifiers in the API
6717  1.11 Support installable IDCT, colorspace conversion routines
6718  1.10 Fixes for 64-bit (don't use "unsigned long")
6719  optimized upsampling by Fabian "ryg" Giesen
6720  1.09 Fix format-conversion for PSD code (bad global variables!)
6721  1.08 Thatcher Ulrich's PSD code integrated by Nicolas Schulz
6722  1.07 attempt to fix C++ warning/errors again
6723  1.06 attempt to fix C++ warning/errors again
6724  1.05 fix TGA loading to return correct *comp and use good luminance calc
6725  1.04 default float alpha is 1, not 255; use 'void *' for stbi_image_free
6726  1.03 bugfixes to STBI_NO_STDIO, STBI_NO_HDR
6727  1.02 support for (subset of) HDR files, float interface for preferred access to them
6728  1.01 fix bug: possible bug in handling right-side up bmps... not sure
6729  fix bug: the stbi__bmp_load() and stbi__tga_load() functions didn't work at all
6730  1.00 interface to zlib that skips zlib header
6731  0.99 correct handling of alpha in palette
6732  0.98 TGA loader by lonesock; dynamically add loaders (untested)
6733  0.97 jpeg errors on too large a file; also catch another malloc failure
6734  0.96 fix detection of invalid v value - particleman@mollyrocket forum
6735  0.95 during header scan, seek to markers in case of padding
6736  0.94 STBI_NO_STDIO to disable stdio usage; rename all #defines the same
6737  0.93 handle jpegtran output; verbose errors
6738  0.92 read 4,8,16,24,32-bit BMP files of several formats
6739  0.91 output 24-bit Windows 3.0 BMP files
6740  0.90 fix a few more warnings; bump version number to approach 1.0
6741  0.61 bugfixes due to Marc LeBlanc, Christopher Lloyd
6742  0.60 fix compiling as c++
6743  0.59 fix warnings: merge Dave Moore's -Wall fixes
6744  0.58 fix bug: zlib uncompressed mode len/nlen was wrong endian
6745  0.57 fix bug: jpg last huffman symbol before marker was >9 bits but less than 16 available
6746  0.56 fix bug: zlib uncompressed mode len vs. nlen
6747  0.55 fix bug: restart_interval not initialized to 0
6748  0.54 allow NULL for 'int *comp'
6749  0.53 fix bug in png 3->4; speedup png decoding
6750  0.52 png handles req_comp=3,4 directly; minor cleanup; jpeg comments
6751  0.51 obey req_comp requests, 1-component jpegs return as 1-component,
6752  on 'test' only check type, not whether we support this variant
6753  0.50 (2006-11-19)
6754  first released version
6755 */
#define STBIDEF
Definition: stb_image.h:417
STBIDEF int stbi_info_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp)
STBIDEF void stbi_set_unpremultiply_on_load(int flag_true_if_should_unpremultiply)
Definition: stb_image.h:403
STBIDEF int stbi_is_hdr_from_callbacks(stbi_io_callbacks const *clbk, void *user)
STBIDEF const char * stbi_failure_reason(void)
void pad(std::vector< FILE * > &disks, uint64_t &offset)
Definition: BinaryFusionUtil.h:45
STBIDEF void stbi_hdr_to_ldr_scale(float scale)
STBIDEF void stbi_set_flip_vertically_on_load(int flag_true_if_should_flip)
STBIDEF float * stbi_loadf_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp)
STBIDEF stbi_uc * stbi_load_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp)
STBIDEF float * stbi_loadf_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp)
STBIDEF float * stbi_loadf_from_file(FILE *f, int *x, int *y, int *comp, int req_comp)
STBIDEF void stbi_ldr_to_hdr_gamma(float gamma)
Definition: stb_image.h:404
Definition: stb_image.h:429
STBIDEF stbi_uc * stbi_load_from_file(FILE *f, int *x, int *y, int *comp, int req_comp)
STBIDEF void stbi_hdr_to_ldr_gamma(float gamma)
STBIDEF int stbi_info_from_file(FILE *f, int *x, int *y, int *comp)
STBIDEF void stbi_ldr_to_hdr_scale(float scale)
STBIDEF int stbi_is_hdr_from_memory(stbi_uc const *buffer, int len)
STBIDEF int stbi_is_hdr_from_file(FILE *f)
STBIDEF int stbi_zlib_decode_noheader_buffer(char *obuffer, int olen, const char *ibuffer, int ilen)
Definition: stb_image.h:405
STBIDEF int stbi_info_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp)
Definition: stb_image.h:402
STBIDEF int stbi_info(char const *filename, int *x, int *y, int *comp)
STBIDEF int stbi_zlib_decode_buffer(char *obuffer, int olen, const char *ibuffer, int ilen)
STBIDEF int stbi_is_hdr(char const *filename)
unsigned char stbi_uc
Definition: stb_image.h:408
STBIDEF stbi_uc * stbi_load_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp)
STBIDEF float * stbi_loadf(char const *filename, int *x, int *y, int *comp, int req_comp)
STBIDEF void stbi_image_free(void *retval_from_stbi_load)
STBIDEF char * stbi_zlib_decode_noheader_malloc(const char *buffer, int len, int *outlen)
STBIDEF char * stbi_zlib_decode_malloc_guesssize_headerflag(const char *buffer, int len, int initial_size, int *outlen, int parse_header)
STBIDEF char * stbi_zlib_decode_malloc(const char *buffer, int len, int *outlen)
Definition: stb_image.h:400
STBIDEF stbi_uc * stbi_load(char const *filename, int *x, int *y, int *comp, int req_comp)
STBIDEF void stbi_convert_iphone_png_to_rgb(int flag_true_if_should_convert)
STBIDEF char * stbi_zlib_decode_malloc_guesssize(const char *buffer, int len, int initial_size, int *outlen)