To create rich visualizations, data analysts often need to iterate back and forth between data processing and chart specification to achieve their goals.
Decompilation aims to convert binary code to high-level source code, but traditional tools like Ghidra often produce results that are difficult to read and execute.
We implement a custom kernel that performs the matrix multiplications and the log-sum-exp reduction over the vocabulary in flash memory, making global memory consumption for the cross-entropy computation negligible.
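The paper describes a fused GPU kernel; as a rough, hedged illustration of the underlying idea only, the PyTorch sketch below computes the loss by streaming over vocabulary chunks with a running log-sum-exp, so the full logit matrix is never materialized at once. The function name, chunk size, and tensor shapes are illustrative assumptions, not the authors' implementation.

```python
import torch

def chunked_cross_entropy(hidden, classifier, targets, chunk=4096):
    """hidden: [N, d] activations, classifier: [V, d] weights, targets: [N] token ids.
    Illustrative sketch: never builds the full [N, V] logit matrix."""
    lse = torch.full((hidden.shape[0],), float("-inf"), device=hidden.device)
    target_logit = (hidden * classifier[targets]).sum(dim=-1)   # logit of the correct token only
    for start in range(0, classifier.shape[0], chunk):
        block = hidden @ classifier[start:start + chunk].T      # [N, chunk] block, discarded each iteration
        lse = torch.logaddexp(lse, torch.logsumexp(block, dim=-1))
    return (lse - target_logit).mean()                          # mean negative log-likelihood

# Tiny usage example with random data (shapes are arbitrary).
h = torch.randn(8, 16)
w = torch.randn(1000, 16)
y = torch.randint(0, 1000, (8,))
print(chunked_cross_entropy(h, w, y))
```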
We scale a proof-of-concept model to 3.5 billion parameters and 800 billion tokens.
Second, leveraging the physical principle of light transport independence, we apply linear blending between the source video's appearance and the relighted appearance, using a Progressive Light Fusion (PLF) strategy to ensure smooth temporal transitions in illumination.
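As a hedged illustration of what such a blend could look like, the sketch below linearly mixes the source and relit appearance with a weight that ramps up progressively; the linear ramp, function names, and the point where the paper applies PLF inside the denoising loop are assumptions for illustration, not the authors' code.

```python
import torch

def blend_appearance(source_frame, relit_frame, weight):
    # Linear blend: weight = 0 keeps the source appearance, weight = 1 keeps the relit one.
    return (1.0 - weight) * source_frame + weight * relit_frame

def progressive_light_fusion(source_frames, relit_frames, num_steps=10):
    # Illustrative schedule: the relit weight grows linearly across steps so the
    # illumination change is introduced gradually rather than all at once.
    for step in range(1, num_steps + 1):
        w = step / num_steps
        yield [blend_appearance(s, r, w) for s, r in zip(source_frames, relit_frames)]

# Usage with dummy frames (3 RGB frames of size 64x64).
src = [torch.rand(3, 64, 64) for _ in range(3)]
rel = [torch.rand(3, 64, 64) for _ in range(3)]
for fused_frames in progressive_light_fusion(src, rel):
    pass  # each step's fused frames would be fed back into the video pipeline
```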
The key idea is simple: factorize the text-to-video generation task into two separate easier tasks for diffusion step distillation, namely text-to-image generation and image-to-video generation.
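A purely structural sketch of that factorization is below, assuming two few-step distilled samplers chained together; the function names, shapes, and step counts are placeholders rather than the paper's models.

```python
import torch

def distilled_text_to_image(prompt: str, steps: int = 4) -> torch.Tensor:
    # Stand-in for a few-step distilled text-to-image sampler; returns a dummy RGB frame.
    return torch.rand(3, 256, 256)

def distilled_image_to_video(first_frame: torch.Tensor, prompt: str,
                             num_frames: int = 16, steps: int = 4) -> torch.Tensor:
    # Stand-in for a few-step distilled image-to-video sampler conditioned on the first frame.
    return torch.stack([first_frame] * num_frames)

def text_to_video(prompt: str) -> torch.Tensor:
    first_frame = distilled_text_to_image(prompt)         # easier task 1: text-to-image
    return distilled_image_to_video(first_frame, prompt)  # easier task 2: image-to-video

print(text_to_video("a cat surfing at sunset").shape)  # torch.Size([16, 3, 256, 256])
```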
The recent success of large vision language models shows great potential for driving agent systems that operate on user interfaces.
Ranked #10 on Natural Language Visual Grounding on ScreenSpot.
Despite notable advancements in Retrieval-Augmented Generation (RAG) systems that expand large language model (LLM) capabilities through external retrieval, these systems often struggle to meet the complex and diverse needs of real-world industrial applications.
We introduce Agentic Reasoning, a framework that enhances large language model (LLM) reasoning by integrating external tool-using agents.
Since the release of ChatGPT, large language models (LLMs) have demonstrated remarkable capabilities across various domains.