Embedded multiprocessors pose new challenges in the design and implementation of embedded software. This has led to a need for programming interfaces that expose the capabilities of the underlying hardware. In addition, for systems whose applications consist of multiple concurrent threads of computation, optimized management of inter-thread communication is crucial for achieving high performance. This paper presents the design of an application-adaptive thread library that conforms to the IEEE POSIX 1003.1c threading standard (P-threads). Based on the application's data access characteristics, the library adapts the placement of both explicitly marked application data objects and implicitly created data objects in a physically distributed on-chip memory architecture.