    • Reducing overheads of dynamic scheduling on heterogeneous chips 

      Corbera, Francisco; Rodríguez, Andrés; Asenjo-Plaza, Rafael; Navarro, Ángeles; Vilches, Antonio; [et al.] (arXiv.org (Cornell University Library), 2015-01-19)
      In recent processor development, we have witnessed the integration of GPUs and CPUs into a single chip. The result of this integration is a reduction of the data communication overheads. This enables an efficient collaboration ...
    • Reorganización de matrices en algoritmos de barrido radial sobre Modelos Digitales del Terreno 

      Romero-Gomez, Luis Felipe; Tabik, Siham; Bandera-Burgueño, Gerardo (2019-09-23)
      In geographic information systems that work with digital terrain models, radial sweep algorithms are very frequently used to study variables associated with parameters whose magnitude decreases ...
    • Reproducible Summation under HUB Format 

      Villalba-Moreno, Julio; Hormigo-Aguilar, Javier; Jaime Rodriguez, Francisco Jose
      Floating-point reproducibility is a property claimed by programmers and end users. Half-Unit-Biased (HUB) is a new representation format in which round-to-nearest is carried out by truncation, preventing any ...
    • Robust tracking for augmented reality 

      González-Linares, José M.; Guil, Nicolás; Ramos Cózar, Julián (2015-06-17)
      In this paper, a method for improving a tracking algorithm in an augmented reality application is presented. This method addresses several issues specific to this particular application, like marker-less tracking and color constancy ...
    • Scalability Analysis of Signatures in Transactional Memory Systems 

      Quislant, Ricardo; Gutiérrez, Eladio; Plata, Óscar (2014-10-29)
      Signatures have been proposed in transactional memory systems to represent read and write sets and to decouple transaction conflict detection from private caches or to accelerate it. Generally, signatures are implemented ...
    • A scheduling theory framework for GPU tasks efficient execution 

      Lázaro Muñoz, Antonio José; López Albelda, Bernabé; Gonzalez-Linares, Jose Maria; Guil-Mata, Nicolas (2018-07-16)
      Concurrent execution of tasks in GPUs can reduce the computation time of a workload by overlapping data transfer and execution commands. However, it is difficult to implement an efficient run-time scheduler ...
    • Simplified Floating-Point Units for High Dynamic Range Image and Video Systems 

      Hormigo-Aguilar, Javier; Villalba-Moreno, Julio (2015-06-29)
      The upcoming arrival of high dynamic range image and video applications in consumer electronics will force the use of floating-point numbers in them. This paper shows that introducing a slight modification on ...
    • Siting Multiple Observers for Maximum Coverage: An Accurate Approach 

      Romero, Luis F.; Tabik, Siham; Cervilla, Antonio R. (2015-06-05)
      The selection of the minimal number of observers that ensures the maximum visual coverage over an area represented by a digital elevation model (DEM) is of great interest in many fields, e.g., telecommunications, environment ...
    • Smith-Waterman Acceleration in Multi-GPUs: A Performance per Watt Analysis 

      Pérez-Serrano, Jesús; Sandes, Edans; Melo, Alba; Ujaldon-Martinez, Manuel (Springer, 2017)
      We present a performance per watt analysis of CUDAlign 4.0, a parallel strategy to obtain the optimal alignment of huge DNA sequences in multi-GPU platforms using the exact Smith-Waterman method. Speed-up factors and ...
    • Solución de múltiples sistemas lineales en GPUs 

      Molero, Jose Manuel; Plaza, Antonio; Martín-Garzón, Esther; García-Fernández, Inmaculada; Quintana-Ortí, Enrique S. (2013-11-05)
      This work focuses on the concurrent computation of multiple linear systems defined by medium-sized dense matrices. A solution based on the Cholesky factorization and its ...
    • Solving Large-Scale Markov Decision Processes on Low-Power Heterogeneous Platforms 

      Constantinescu, Denisa-Andreea; Gonzalez-Navarro, Maria Angeles; Corbera, Francisco; Fernández-Madrigal, Juan Antonio; Asenjo-Plaza, Rafael (2019-07-11)
      Markov Decision Processes (MDPs) provide a framework for a machine to act autonomously and intelligently in environments where the effects of its actions are not deterministic. MDPs have numerous applications. We focus ...
    • Tasks Fairness Scheduler for GPU 

      López Albelda, Bernabé; Gonzalez-Linares, Jose Maria; Guil-Mata, Nicolas (2019-09-24)
      Nowadays GPU clusters are available in almost every data processing center. Their GPUs are typically shared by different applications that might have different processing needs and/or different levels of priority. As current ...
    • Three is not a crowd: A CPU-GPU-FPGA K-means implementation 

      Canales, Marcos; Cancer, Jorge; Constantinescu, Denisa; Escuin, Carlos; Perez, Borja (2017-06-15)
      Clustering is the task of assigning a set of objects into groups (clusters) so that objects in the same group are more similar to each other than to those in other groups. In particular, K-means is a clustering algorithm ...
    • Time Series Analysis Using Transprecision Computing 

      Fernández-Vega, Iván (2019-09-11)
      This work presents results using transprecision techniques for reducing the precision of the computation of time series analysis. The developed benchmark makes it possible to explore how the accuracy of the results is affected by ...
    • Time Series Heterogeneous Co-execution on CPU+GPU 

      Romero, José Carlos; Gonzalez-Navarro, Maria Angeles; Rodriguez-Moreno, Andres; Asenjo-Plaza, Rafael; Cole, Murray (2019-07-10)
      Time series motif (similarities) and discords discovery is one of the most important and challenging problems nowadays for time series analytics. We use an algorithm called “scrimp” that excels in collecting the relevant ...
    • TMbarrier: speculative barriers using hardware transactional memory 

      Pedrero, Manuel; Gutiérrez Carrasco, Eladio D.; Plata, Oscar (2018-11-15)
      The barrier is a very common synchronization method used in parallel programming. Barriers are typically used to enforce a partial thread execution order, since there may be dependences between code sections before and after ...
    • Towards a Software Transactional Memory for heterogeneous CPU-GPU processors 

      Villegas, Alejandro; Navarro, Angeles; Asenjo-Plaza, Rafael; Plata, Oscar (2017-09-15)
      Heterogeneous Accelerated Processing Units (APUs) integrate a multi-core CPU and a GPU within the same chip. Modern APUs provide the programmer with platform atomics, used for communication between the CPU cores and the GPU using ...
    • Towards the intelligent diagnosis of hematological diseases 

      Díaz-Del-Pino, Sergio; Trelles-Martínez, Roberto; Perez-Wohlfeil, Esteban; Trelles-Salazar, Oswaldo Rogelio (2019-11-18)
      In traditional medicine, patient diagnosis usually implies an in-depth study of the patient's state and symptoms that a specialist has to carry out. The adaptation and customization of the medical treatment to those individual ...
    • A weakly-supervised approach for discovering common objects in airport video surveillance footage 

      Castro Payan, Francisco Manuel; Delgado-Escaño, Rubén; Guil-Mata, Nicolas; Marín-Jiménez, Manuel J. (2019-07-22)
      Object detection in video is a relevant task in computer vision. Standard and current detectors are typically trained in a strongly supervised way, which requires a huge amount of labelled data. In contrast, in this paper ...
    • Workflows and service discovery: a mobile device approach 

      Holthausen, Ricardo; Díaz-Del-Pino, Sergio; Pérez-Wohlfeil, Esteban; Rodríguez-Brazzarola, Pablo; Trelles-Salazar, Oswaldo Rogelio (Springer, Cham, 2018-03)
      Bioinformatics has moved from command-line standalone programs to web-service based environments. This trend has resulted in an enormous amount of online resources which can be hard to find and identify, let alone execute ...