With the explosive growth of digital text, efficiently obtaining specific knowledge from massive amounts of unstructured text has become challenging. As a basic task in natural language processing (NLP), relation extraction (RE) aims to extract the semantic relation between a pair of entities from a given text. To avoid manual labeling of datasets, distant supervision relation extraction (DSRE) has been widely used; it relies on a knowledge base to annotate datasets automatically. Unfortunately, this method suffers heavily from wrong labeling because of its strong underlying assumptions. To address this issue, we propose a new framework for DSRE that combines a hybrid attention-based Transformer block with multi-instance learning. More specifically, the Transformer block is used, for the first time, as a sentence encoder, relying mainly on multi-head self-attention to capture syntactic information at the word level. A novel sentence-level attention mechanism is then proposed to compute the bag representation, with the aim of exploiting all useful information in each sentence. Experimental results on the public New York Times (NYT) dataset show that the proposed approach outperforms state-of-the-art algorithms, which verifies the effectiveness of our model on the DSRE task.
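To make the idea of a bag representation concrete, the following is a minimal sketch of sentence-level attention in multi-instance learning. It is not the mechanism proposed here, whose details are not given in this passage; it substitutes a plain dot-product attention over the sentences in a bag. The names `softmax`, `bag_representation`, `sentence_vecs`, and `relation_query` are hypothetical, and the toy vectors stand in for the outputs of a sentence encoder.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def bag_representation(sentence_vecs, relation_query):
    """Return (attention weights, weighted sum of sentence vectors).

    Assumption: each sentence is scored by its dot product with a
    relation query vector. This is a generic stand-in, not the
    paper's attention function.
    """
    scores = [
        sum(x * q for x, q in zip(vec, relation_query))
        for vec in sentence_vecs
    ]
    weights = softmax(scores)
    dim = len(relation_query)
    bag = [
        sum(w * vec[i] for w, vec in zip(weights, sentence_vecs))
        for i in range(dim)
    ]
    return weights, bag


# Toy bag: three encoded sentences that mention the same entity pair.
sentence_vecs = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
relation_query = [2.0, 0.0]  # hypothetical relation embedding

weights, bag = bag_representation(sentence_vecs, relation_query)
```

Sentences whose vectors match the relation query poorly, such as the second one in the toy bag, receive a small weight, so a wrongly labeled sentence contributes less to the bag representation than the others.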