<?xml version="1.0" encoding="UTF-8"?><?xml-stylesheet type="text/xsl" href="static/style.xsl"?><OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/ http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd"><responseDate>2026-05-30T02:32:59Z</responseDate><request verb="GetRecord" identifier="oai:riuma.uma.es:10630/14425" metadataPrefix="qdc">https://riuma.uma.es/rest/oai/request</request><GetRecord><record><header><identifier>oai:riuma.uma.es:10630/14425</identifier><datestamp>2026-02-03T12:08:58Z</datestamp><setSpec>com_10630_2254</setSpec><setSpec>col_10630_37959</setSpec></header><metadata><qdc:qualifieddc xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:dcterms="http://purl.org/dc/terms/" xmlns:doc="http://www.lyncode.com/xoai" xmlns:qdc="http://dspace.org/qualifieddc/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://purl.org/dc/elements/1.1/ http://dublincore.org/schemas/xmls/qdc/2006/01/06/dc.xsd http://purl.org/dc/terms/ http://dublincore.org/schemas/xmls/qdc/2006/01/06/dcterms.xsd http://dspace.org/qualifieddc/ http://www.ukoln.ac.uk/metadata/dcmi/xmlschema/qualifieddc.xsd">
   <dc:title>Hardware support for scratchpad memory transactions on GPU architectures</dc:title>
   <dc:creator>Villegas Fernández, Alejandro</dc:creator>
   <dc:creator>Asenjo-Plaza, Rafael</dc:creator>
   <dc:creator>González-Navarro, María Ángeles</dc:creator>
   <dc:creator>Plata-González, Óscar Guillermo</dc:creator>
   <dc:creator>Ubal, Rafael</dc:creator>
   <dc:creator>Kaeli, David</dc:creator>
   <dc:subject>Computers - Input-output equipment - Congresses</dc:subject>
   <dcterms:abstract>Graphics Processing Units (GPUs) have become the accelerator of choice for data-parallel applications, enabling the execution of thousands of threads in a Single Instruction, Multiple Thread (SIMT) fashion. Using OpenCL terminology, GPUs offer a global memory space shared by all the threads in the GPU, as well as a low-latency local memory space shared by a subset of the threads. The latter is used as a scratchpad to improve application performance.
We propose GPU-LocalTM, a hardware transactional memory (TM), as an alternative to data locking mechanisms in local memory. GPU-LocalTM allocates transactional metadata in the existing memory resources, minimizing the storage requirements for TM support. In addition, it ensures forward progress through an automatic serialization mechanism. In our experiments, GPU-LocalTM provides up to 100X speedup over serialized execution.</dcterms:abstract>
   <dcterms:dateAccepted>2017-09-06T06:41:12Z</dcterms:dateAccepted>
   <dcterms:available>2017-09-06T06:41:12Z</dcterms:available>
   <dcterms:created>2017-09-06T06:41:12Z</dcterms:created>
   <dcterms:issued>2017-08-29</dcterms:issued>
   <dc:type>conference output</dc:type>
   <dc:identifier>http://hdl.handle.net/10630/14425</dc:identifier>
   <dc:identifier>http://orcid.org/0000-0002-1570-3863</dc:identifier>
   <dc:language>eng</dc:language>
   <dc:relation>Euro-Par 2017: Parallel Processing</dc:relation>
   <dc:relation>Santiago de Compostela, Spain</dc:relation>
   <dc:relation>August 2017</dc:relation>
   <dc:rights>open access</dc:rights>
   <dc:rights>by-nc-nd</dc:rights>
   <dc:publisher>Springer</dc:publisher>
</qdc:qualifieddc>
</metadata></record></GetRecord></OAI-PMH>