Detection of Genomic Uracil Patterns - PMC

15 Jul.,2024

 

Investigation of DNA damage, repair, and epigenetic base modifications has become a rapidly developing scientific field, especially in the last decade, driven by numerous new technical solutions such as new generations of DNA sequencing approaches [72,73]. Li and Sancar provided a comprehensive overview of crucial methods and developments in genome-wide DNA-damage mapping; however, uracil was not fully covered in their work [72]. Another review, from Sturla's lab, focuses on NGS-based DNA damage sequencing methods and provides a thorough categorization based on the different technical solutions for library preparation [73].

Here, we provide a summary of diverse uracil-DNA detection methods with their advantages and limitations, and discuss their results and conclusions. We detail the global quantitative U-detection methods as well as various emerging solutions for in situ uracil-DNA detection. For the genome-wide mapping methods, we also consider the relevance and benefits of single-base resolution, as well as the potential pitfalls in data analysis.

As an independent approach, a new U-DNA sensor protein was developed for multiple purposes, including a quick, one-step, semi-quantitative dot blot application in which uracil is recognized directly, without any further enzymatic or chemical reactions [28]. For this, the inactive D145N/H268N double mutant of human UNG was used as a starting construct, from which the N-terminal 84 residues were deleted to eliminate undesired protein–protein interaction surfaces (ΔUNG sensor) [87,88]. It was demonstrated that such a UNG-based sensor equipped with a 3xFLAG tag is an appropriate tool for dot blot quantification of uracil against a standard of known uracil content [28]. Uracil levels were determined in both bacterial (wt, ung−/−, and ung−/− dut−/− double mutant E. coli) and higher eukaryotic (Drosophila S2 cells and the human colon cancer cell line HCT116) genomes upon treatment with drugs that inhibit thymidylate biosynthesis. The fast and straightforward dot blot applications do not require mass spectrometry infrastructure; however, mass spectrometric methods provide higher accuracy, especially at low uracil levels.

Another approach utilizes alkoxyamine-based aldehyde reactive probes (ARPs) to chemically label the aldehyde group in the deoxyribose moiety at AP sites [79]. Biotinylated ARP reagents were used for the detection of oxidative base damage and AP sites in an (ELISA-like) dot blot application [80]. The Ung-ARP assay, developed in Bennett's group, combines specific enzymatic removal of uracil with detection of the resulting AP sites by a biotinylated ARP reagent [81]. Further developments led to two alkoxyamine reagents, AA3 and AA6, which offer increased reactivity and carry functional groups suitable for conjugation to a wide variety of biochemical labels by click chemistry [82,83]. These reagents were used in different applications [84,85,86].

The most straightforward way to quantify the overall uracil content of a DNA sample is liquid chromatography coupled to tandem mass spectrometry (LC-MS/MS) [74]. The method is based on enzymatic digestion of the DNA to 2′-deoxyribonucleosides using DNase I and nuclease P1, followed by preparative HPLC purification coupled to MS/MS identification of deoxyuridine (dUrd), and it employs an isotope-labelled internal standard. With this approach, and by systematically addressing possible technical pitfalls, the uracil content of the murine and the human genome was determined to be ~0.15 and ~0.08 uracil/10⁶ bases, respectively, considerably lower than suggested previously by other MS-based methods [75,76,77,78]. UNG deficiency led to an increase, up to ~1.2 and ~0.35 uracil/10⁶ bases, respectively.

These in situ detection solutions provide potent tools for the relatively quick and efficient detection of genomic uracils in very different biological samples, either when uracil levels increase (e.g., upon drug treatments [91]) or when uracils cluster spatially at genomic loci (e.g., upon targeted enzymatic cytosine deamination [85]). In situ detection methods for AP sites were available earlier [77], but their application to genomic uracil is not straightforward. With the new direct approaches described above, it is also possible to identify (or even screen for) new biological situations and conditions in which particular patterns emerge, indicating special, potentially novel biological roles of genomic uracil. These sensor constructs and the coupled detection strategies might be especially powerful in combination with genome-wide localization data, which could provide good candidate protein markers for in situ colocalization studies. Such in-cell studies are more cost-efficient and flexible than genome-wide NGS-based sequencing methods for broad screening of treatment-induced changes in various biological samples. However, without knowledge of the genome-wide distribution, in situ detection alone cannot provide essential new insight into the diverse roles of genomic uracil. Hence, combining the two approaches is indispensable.

The mycobacterial UdgX, which forms a covalently trapped complex with uracil-DNA, was also employed for U-DNA detection [92]. The UdgX enzyme belongs to a bacterial UDG family harboring an Fe–S cluster, with a sequence motif, KIRRH (called the R-loop), essential for its function. Three-dimensional (3D) structures of UdgX complexes together with LC-MS/MS analysis revealed a covalent link between His109 of the KIRRH motif and the deoxyribose at the AP site [93,94,95]. It is straightforward to use this unique enzyme as a uracil-DNA sensor, similarly to the ΔUNG described above. An mCherry-tagged UdgX was constructed and characterized in detail as a highly sensitive U-DNA sensor for detecting U-DNA by confocal microscopy in wild type (wt) and ung−/− dut−/− E. coli [96]. A FLAG-tagged UdgX-based sensor was also used in human cells to detect uracils in ssDNA arising upon the induction of APOBEC3A in cisplatin-treated HEK293T cells, revealing uracil colocalization with replication protein A at stalled replication forks [85]. An advantage of this sensor might be that it can be introduced into living cells by transfection, so that only the immunodetection steps have to be carried out in fixed samples. UdgX showed a strong (3–5 orders of magnitude) preference for uracils in ssDNA over U:A in dsDNA, which on the one hand ensures selectivity, and on the other hand somewhat limits its applicability for addressing certain biological questions. It is important to note that the SNAP-tagged ΔUNG construct recognizes uracils within fixed cells in all the contexts that are normally recognized by wt UNG (ssU, U:A, and U:G pairs).

Two early approaches, radio labelling [ 89 ] and a modified comet assay [ 90 ], were applied for in situ U-DNA detection, allowing quantitation with only low resolution. Recently, the ΔUNG sensor described above was further developed to allow in situ detection as well. First, fluorescently tagged ΔUNG sensors were shown to be appropriate for the in situ detection of uracil-containing exogenous plasmid DNA within the context of eukaryotic cells [ 28 ]. Later, this sensor was equipped with SNAP-tags and used in super-resolution fluorescent microscopy (STED and dSTORM) to detect endogenous genomic uracils in human cells [ 91 ]. The FLAG-tagged ΔUNG sensor was also used for genome-wide mapping (U-DNA-Seq), and the combination of the two approaches revealed that the RTX and the 5FdUR treatment induced uracil-enriched regions/loci colocalized with the active histone mark, H3K36me3, and the facultative heterochromatin mark, H3K27me3, respectively [ 91 ].

While the approaches above rely on specific enzymatic reactions by UNG and APE, a fully independent PCR-based method utilizes the different sensitivity of the archaeal DNA polymerase Pfu and its V93Q mutant version to uracil-containing templates [27]. The structural basis and the functional consequences of the binding of archaeal family B polymerases to uracil bases in the template DNA strand have already been well described [101]. While a single uracil base can stall DNA synthesis by wt Pfu polymerase, V93Q mutant Pfu preserves its activity even on fully uracil-substituted templates [102]. When wt and V93Q mutant Pfu are applied in parallel PCR reactions on the same template dilution series, the uracil content of the template region defined by the two PCR primers can be quantified from the difference between the corresponding Cq values [27].
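The ΔCq reasoning can be sketched in a few lines of Python. This is only an illustration, not the published calculation from [27]: it assumes that both enzymes amplify with the same per-cycle efficiency, that any template strand carrying at least one uracil is lost to wt Pfu, and that uracils are Poisson-distributed among templates. All function names and example values are hypothetical.

```python
import math


def amplifiable_fraction(cq_wt, cq_mut, efficiency=2.0):
    """Fraction of templates wt Pfu can amplify, relative to V93Q Pfu.

    Assumes both enzymes share the same per-cycle efficiency
    (2.0 = perfect doubling), so a delay of dCq cycles corresponds to an
    efficiency**dCq-fold lower amount of usable template.
    """
    delta_cq = cq_wt - cq_mut
    return efficiency ** (-delta_cq)


def uracils_per_template(cq_wt, cq_mut, efficiency=2.0):
    """Mean uracil count per template strand under a Poisson assumption.

    With randomly scattered uracils, the uracil-free fraction equals
    exp(-mean), hence mean = -ln(fraction).
    """
    fraction = amplifiable_fraction(cq_wt, cq_mut, efficiency)
    fraction = min(fraction, 1.0)  # clamp: noise can make wt Pfu look "faster"
    return -math.log(fraction)


def uracils_per_million_bases(cq_wt, cq_mut, amplicon_length, efficiency=2.0):
    """Scale the per-template estimate to uracils per 10^6 bases."""
    per_template = uracils_per_template(cq_wt, cq_mut, efficiency)
    return per_template / amplicon_length * 1e6


if __name__ == "__main__":
    # Hypothetical Cq values: wt Pfu lags the V93Q mutant by one cycle.
    print(amplifiable_fraction(26.0, 25.0))
    print(uracils_per_million_bases(26.0, 25.0, amplicon_length=1000))
```

In practice the efficiency would be taken from the dilution series mentioned above rather than assumed to be 2.0.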

Another PCR-based approach simply quantifies the difference between the amounts of intact template in samples pre-treated with UNG alone or with UNG + APE and in the untreated sample. Such quantification can be performed by qPCR, and also by the more convenient digital PCR techniques (such as digital droplet PCR (ddPCR)). In the ex-ddPCR method, samples are treated with UNG, and the fraction of amplicons containing at least one uracil on each strand is determined from the positive PCR counts in the treated and untreated samples [100]. It was shown that the viral gag gene accumulates uracils only in monocyte-derived macrophages (MDM), but not in T cells.
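The arithmetic behind such a treated-versus-untreated comparison can be sketched as follows. The Poisson correction of droplet counts is standard digital PCR practice; the way the two samples are combined here is an assumption of this sketch and not necessarily the exact formula used in [100]. Function names and droplet counts are hypothetical.

```python
import math


def copies_per_droplet(positive, total):
    """Poisson-corrected mean number of amplifiable copies per droplet.

    A droplet stays negative only if it received zero amplifiable copies,
    so P(negative) = exp(-lam) and lam = -ln(1 - positive / total).
    """
    if not 0 <= positive < total:
        raise ValueError("need 0 <= positive < total droplets")
    return -math.log(1.0 - positive / total)


def uracil_containing_fraction(pos_treated, total_treated,
                               pos_untreated, total_untreated):
    """Fraction of amplicons lost to the excision treatment.

    Assumes equal DNA input in both reactions and that every
    uracil-containing amplicon becomes non-amplifiable after treatment.
    """
    treated = copies_per_droplet(pos_treated, total_treated)
    untreated = copies_per_droplet(pos_untreated, total_untreated)
    if untreated == 0:
        raise ValueError("untreated sample has no positive droplets")
    return 1.0 - treated / untreated


if __name__ == "__main__":
    # Hypothetical counts: 2500 of 10000 droplets positive after treatment,
    # 5000 of 10000 positive without treatment.
    print(uracil_containing_fraction(2500, 10000, 5000, 10000))
```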

Almost at the same time as the publication of 3D-PCR, the Gearhart group applied a combined in vitro reaction of UNG and APE1 to detect uracil in an exogenous plasmid from which AID was expressed in bacteria [98]. They showed AID induction-related increases in the number of nicks generated by the UNG/APE reaction on alkaline agarose gels, and located the uracil moieties on the non-transcribed strand using a denaturing Southern blot. Furthermore, they applied polymerase β without the addition of dNTPs, using only its 5′-dRP hydrolyzing function to introduce nicks at the sites of uracils. They then performed primer extension on the nicked template using a specific, biotinylated primer, which yields a dsDNA end appropriate for blunt-end adapter ligation. By clonal sequencing of the products of this ligation-mediated PCR (LM-PCR), they localized the original positions of uracils within the non-transcribed strand of the AID-expressing plasmid with single-base resolution [98]. Later, they developed this technique further and successfully detected uracils in the immunoglobulin genes of ung-deficient, AID-expressing B cells as compared to ung-deficient Aicda−/− cells (ung-deficient chicken DT40 clones either overexpressing chicken AID or Aicda−/−, and B220+GL7+ spleen cells from ung−/− and ung−/− Aicda−/− mice that were either immunized right before the cells were isolated, or whose isolated cells were stimulated ex vivo with LPS and IL-4) [99].

To localize uracils within a target DNA sequence, several PCR-based methods are available that can provide an indication of the presence of U:G pairs, exact localization (even with single-base resolution), or accurate quantitation of the uracils within the target sequence. The first published technique to detect C:G to T(U):A transitions due to cytosine deamination in DNA was differential DNA denaturation PCR (3D-PCR) [97]. This technique relies on the lower denaturation temperature (Td) of DNA templates with higher AT content. It applies a series of progressively lowered Td values in PCR reactions that amplify a specific target sequence defined by the two PCR primers. Where C:G to T:A transitions have occurred within the amplified region of the template DNA, the specific PCR product can be detected at a lower Td [97]. Later, this 3D-PCR technique was applied in combination with a UNG inhibitor, UGI, to detect uracil-DNA intermediates of APOBEC3A-catalyzed cytosine deamination in a reporter plasmid DNA [22].
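The principle behind 3D-PCR can be illustrated with a short Python sketch: each C-to-T transition lowers the GC fraction of the amplicon, and the experimental readout is the lowest Td at which a product is still obtained. The sequence, positions, and temperatures below are invented for illustration, and the helper names are hypothetical; no quantitative Td model is implied.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def deaminate(seq, positions):
    """Return the sequence after C->T transitions at 0-based positions."""
    bases = list(seq.upper())
    for i in positions:
        if bases[i] != "C":
            raise ValueError(f"position {i} is not a cytosine")
        bases[i] = "T"
    return "".join(bases)


def lowest_productive_td(results):
    """Lowest denaturation temperature (deg C) that still gave a product.

    `results` maps each tested Td to True/False (product detected or not).
    Returns None if no reaction in the gradient gave a product.
    """
    productive = [td for td, detected in results.items() if detected]
    return min(productive) if productive else None


if __name__ == "__main__":
    original = "ACGTCCGGAT"
    edited = deaminate(original, [1, 4])
    print(gc_fraction(original), gc_fraction(edited))

    # Hypothetical Td gradients: the edited template amplifies at lower Td.
    control = {88.0: False, 89.0: False, 90.0: True}
    sample = {88.0: False, 89.0: True, 90.0: True}
    print(lowest_productive_td(control), lowest_productive_td(sample))
```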

2.4. NGS Based U-DNA Detection for Genome-Wide Mapping

All methods described above are PCR-based; hence, they are limited to determining a local uracil content, which is valid for the whole DNA sample only to the extent that the genome-wide distribution of uracil is uniform. Depending on the origin of the uracil-DNA, its genome-wide distribution can be more or less patterned: enzymatic cytosine deamination might result in a strongly targeted localization (e.g., focusing on variable and switch regions of the Ig genes, or kataegic-like clusters [25]), while spontaneous cytosine deamination and thymine-replacing misincorporation due to the insensitivity of the DNA polymerases are more stochastic and random processes. In these latter cases, any pattern that exists should originate from additional mechanisms, such as altered accessibility of the differentially packaged genomic DNA, the unequal distribution of repair processes, and different polymerases with altered sensitivity and specificity. Indeed, in the last decade, numerous new results have supported the hypothesis that distinct repair proteins [103] and/or different polymerases [104] are loaded onto certain genomic loci rather differently. Many NGS-based approaches have been published addressing epigenetic marks (ChIP-seq), DNA methylation [105], DNA repair loci (e.g., XR-seq, HS-XR-seq, and Damage-seq [50,106,107]), other base modifications (e.g., OG-seq [108], click-code-seq [109]), or AP sites (e.g., snAP-seq [103]). Similarly, genome-wide uracil mapping solutions have been developed and are becoming crucial for a better understanding of the significance and consequences of uracil appearance in DNA in different biological contexts. For these 'seq' methods, the PCR-based uracil localization and quantification techniques described above provide essential validation opportunities.

The first published method applied for genome-wide uracil mapping was Excision-seq, applied in E. coli and yeast [110]. Excision-seq also relies on the coupled enzymatic reactions of bacterial UNG and the AP endonuclease ENDO IV, and combines them with massively parallel DNA sequencing (NGS). Two versions were developed: pre-digestion and post-digestion Excision-seq. The pre-digestion version requires a uracil content high enough for the in vitro UNG/ENDO IV treatment alone to fragment the DNA efficiently. The sequencing library is then prepared after size selection, without any additional fragmentation. The ligation position of the sequencing adapter at the 5′ ends reports the original sites of uracils at practically single-base resolution, similarly to the ligation-mediated PCR method [99] described above. As a complementary approach, post-digestion Excision-seq applies the UNG/ENDO IV treatment to the prepared DNA fragment library; increased read coverage in the excised samples compared to the non-treated controls then indicates genomic regions from which uracils were excluded. However, the sensitivity of such an inverse approach depends strongly on the sequencing depth and requires rather uniform genome coverage, which might limit the size and complexity of the genomes that can be addressed by this technique. The two versions of Excision-seq were reported as adequate methods for the efficient detection of elevated uracil levels upon dUTPase and UDG deficiency in small genomes, such as those of E. coli and yeast strains [110]. The authors concluded that uracil is excluded from the very early and very late replicating genomic segments, and assumed that such regulation might involve alterations of the cellular dNTP pool during DNA synthesis [110]. Nevertheless, a larger genome size (e.g., mammalian genomes) and the low frequency and/or the nature of the distribution of uracils might result in biases or underestimation with the Excision-seq method, especially its pre-digestion version. In this respect, enrichment or pull-down-based methods might be more efficient. Moreover, it has not yet been demonstrated how beneficial the single-base resolution capability of pre-digestion Excision-seq is. Indeed, the same group used this method on the 10 kb HIV genomes from different in vitro infected immune cells, showing uniformly distributed uracilation of the proviral genome [100]. Although this method was also extended to the mapping of other DNA base modifications [111] and has been cited by reviews and other research papers, Excision-seq has not yet become widely used to characterize other biological systems. In one case, pre-digestion Excision-seq was applied as a complementary technique to support the results from dU-seq (which will be discussed later [112]).

Meanwhile, also aiming at single-base resolution detection of uracil (and other DNA lesions) within DNA, two other approaches were developed on model DNA by Burrows' lab and proposed for use in a genomic context as well [113,114]. Their first method also relies on the UNG/APE enzymatic treatment (or, more generally, on other base modification-specific glycosylases and the appropriate AP endonuclease or AP lyase), which is followed by enzymatic labelling of the gapped strand with unnatural nucleotides (dNaM or dMMO2). The bases of these nucleotides pair selectively with the unnatural base d5SICS in PCR reactions, forming unnatural base pairs (UBPs). Such UBPs can then be detected either by Sanger sequencing (UBPs stop the sequencing reactions) or by nanopore sequencing technology, where the position of UBPs can be determined with single-nucleotide resolution in the context of single DNA duplexes [113,115]. This method was developed and tested on synthetic DNA models, simulating biologically relevant lesions with their heterogeneous sequence context and also the effect of a large excess of undamaged DNA [113]. It was also suggested that, in combination with an enrichment of lesion-containing DNA strands, the method can be adapted for complex biological systems. Their second method relies on ligatable gaps that arise upon in vitro treatment with the glycosylase specific for a given modified base; sequencing of the ligated products by any commonly used NGS technique and identification of single-nucleotide deletions then report the positions of the original DNA lesions [114]. Although this approach seems cheaper and more accessible to the wider scientific community, it has not yet been demonstrated that it can work on large genomes, especially those with low uracil content. Basically, the limitations of this method should be similar to those of pre-digestion Excision-seq. Indeed, the authors suggest that its best application might be in single-cell sequencing, where a certain base modification is present at 100% (note: 50% in the case of diploid genomes). Furthermore, the relatively high chance of a single-nucleotide deletion arising from sequencing error or from naturally occurring variation within the sample can also impair the sensitivity.

Two other sequencing methods with single-base resolution are also worth presenting here. They were developed for the detection of AP sites but can easily be adapted for uracil detection by applying a preceding UNG treatment (as is also true for other seq methods designed for AP sites). In Balasubramanian's lab, snAP-seq was developed and used on genomes of different sizes to answer different biological questions [103]. snAP-seq applies selective chemical labelling of AP sites [116] and enrichment via a biotin–streptavidin system. The authors demonstrated the selectivity of their method for the AP site aldehyde over the formylcytosine aldehyde, achieved by combining the chemical labelling of the aldehyde groups with elution from the streptavidin resin under alkaline conditions that hydrolyze the sugar–phosphate backbone at the AP site. First, chemical labelling is performed on the fragmented DNA; then the P7 sequencing adapter is ligated, followed by pull-down on streptavidin beads and selective elution by alkaline cleavage. Using the P7 adapter, a primer extension is performed on the eluted ssDNA fraction, after which only the AP cleavage-related 5′-phosphates are available for ligation with the other sequencing adapter, P5. Hence, the enrichment of relevant DNA fragments is quite efficient, and a majority (95%) of the sequenced fragments start exactly one base downstream of the original AP sites, as measured on a model DNA. This method was applied for single-base resolution detection of hmU in Leishmania major [103], where hmU is a precursor of the epigenetic marker base J and is thought to be introduced enzymatically [117,118,119]. The method was also used in human cell lines to detect AP sites upon the silencing of APE1; however, its single-base resolution potential could not really be exploited in this latter case, owing to the more randomized genomic distribution of the AP sites [103].
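Because most snAP-seq reads are reported to start exactly one base downstream of the original AP site, a site coordinate can in principle be inferred from each alignment. The following Python sketch shows that coordinate arithmetic only. It assumes 0-based, half-open alignment intervals and that "downstream" is meant in the read's own 5′ to 3′ orientation; the function names and read tuples are hypothetical and do not come from the published pipeline [103].

```python
from collections import Counter


def ap_site_from_read(start, end, strand):
    """Infer the AP-site coordinate from one aligned read.

    start, end: 0-based, half-open alignment interval on the reference.
    strand: '+' or '-'.
    The read's 5' end lies one base downstream of the AP site (in the
    read's own orientation), so the site is one base upstream of it.
    """
    if strand == "+":
        return start - 1  # 5' end is `start`; upstream is start - 1
    if strand == "-":
        return end        # 5' end is `end - 1`; upstream on '-' is `end`
    raise ValueError("strand must be '+' or '-'")


def count_sites(reads):
    """Count reads supporting each (chromosome, position, strand) site."""
    return Counter(
        (chrom, ap_site_from_read(start, end, strand), strand)
        for chrom, start, end, strand in reads
    )


if __name__ == "__main__":
    # Hypothetical alignments: (chromosome, start, end, strand).
    reads = [
        ("chr1", 100, 150, "+"),
        ("chr1", 100, 140, "+"),
        ("chr1", 60, 99, "-"),
    ]
    for site, n in count_sites(reads).items():
        print(site, n)
```

Sites supported by many independent read starts are more credible than singletons, which is why the counts are kept rather than a plain set of positions.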

The other similarly creative method is Nick-seq, developed by Dedon's group for single-nucleotide resolution genomic maps of different DNA modifications and damage [120]. The method relies on the conversion of modified bases into single-strand breaks, on which two different types of polymerase reactions are performed separately. One portion of the sample is subjected to nick translation using α-thio-dNTPs to produce hydrolysis-resistant oligonucleotides downstream of the single-strand break. The rest of the DNA can then be selectively removed by exonuclease III and RecJ, and the resistant phosphorothioate-containing oligonucleotides can be sequenced. The second portion of the sample is used for poly(dT) tailing of the 3′ end at the strand break by terminal transferase (TdT), and this tail is then used for library preparation. With this approach, sequencing of the two separately processed samples can confirm the position of a base modification from two directions.

A novel genome-wide uracil detection method, dU-seq, was also combined with the pull-down technique [112]. This method applies an enzymatic cascade including E. coli UDG to convert genomic uracils into AP sites, which are cleaved by ENDO IV; the gaps are then resynthesized by Bst DNA polymerase in the presence of a biotinylated nucleotide triphosphate. The biotinylated DNA fragments are then pulled down on streptavidin beads, where the Y adapter for sequencing is ligated before elution for 3 min at 95 °C in distilled water. The eluted DNA fragments are amplified by PCR and sequenced on the Illumina platform. Prior to the enzymatic treatments, AP sites, ssDNA breaks, and ssDNA ends were repaired. The input and the enriched samples were sequenced, and peak calling was performed with the model-based analysis of ChIP-seq (MACS2) software. Only peaks uniquely present in the pull-down versus the control were considered in the subsequent analysis. Uracil enrichment within the centromeres was reported, which was also confirmed to some extent by independent methods including pre-digestion Excision-seq, LC-MS/MS, and 3D-PCR.
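The filtering step, keeping only peaks uniquely present in the pull-down, can be sketched in Python as an interval comparison. How "uniquely present" was defined in [112] is not stated here; this sketch assumes that any overlap with a control peak disqualifies a pull-down peak. Peak tuples and function names are hypothetical.

```python
def overlaps(a, b):
    """True if two (chrom, start, end) half-open intervals share a base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def unique_peaks(pulldown, control):
    """Pull-down peaks that overlap no control peak.

    Quadratic in the number of peaks, which is fine for a sketch; a real
    pipeline would sort the intervals or use a dedicated interval tool.
    """
    return [p for p in pulldown if not any(overlaps(p, c) for c in control)]


if __name__ == "__main__":
    # Hypothetical peak calls as (chromosome, start, end).
    pulldown = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
    control = [("chr1", 150, 250)]
    print(unique_peaks(pulldown, control))
```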

The UDP-seq method, another DNA-IP-seq application, is quite similar to dU-seq except that the biotin label is introduced at the uracil sites in a non-enzymatic chemical reaction [86]. The alkoxyamine moiety of the commercially available reagent EZ-Link Alkoxyamine-PEG4-SS-Biotin (ssARP) (Thermo Scientific) covalently labels the opened ring of the base-free deoxyribose at the AP site. The S–S bridge allows efficient elution by reducing agents such as dithiothreitol (DTT). Chemical blocking of pre-existing AP sites, which could otherwise interfere with uracil detection, is necessary. Application of UDP-seq in bacterial systems showed, on the one hand, that upon dUTPase and UNG deficiency the elevated uracil incorporation occurs mostly at the replication origin. On the other hand, the cytosine deamination patterns catalyzed by ectopically expressed APOBEC3A (A3A) were addressed in E. coli in which both UDGs are mutated. The control pull-down without UNG treatment, which checks for non-specific binding, was omitted, but control samples were introduced: active A3A-expressing cells were compared to cells containing either inactive A3A or an empty plasmid. In the data analysis, peak calling was performed with MACS, and a normalized differential coverage (NDC) was calculated for a 100 bp moving window. A uracilation index (UI) was introduced to measure the frequency of the TC to TT transitions specific for A3A, and a preference of A3A activity was detected for the lagging strand during replication, as well as for short hairpin loops, tRNA and rRNA genes, and the 5′ termini of some protein-coding genes.

In both dU-seq and UDP-seq, numerous enzymatic/chemical steps are involved, resulting in a complex arrangement with multiple potential pitfalls. Additionally, abasic sites independent of uracils need to be carefully considered. Moreover, sticky DNA ends could influence the polymerase-based labelling approach in dU-seq. Nevertheless, both dU-seq and UDP-seq are based on well-established experimental setups and take advantage of the highly efficient biotin–streptavidin pull-down system. Processing and interpretation of the NGS data involve several critical issues that will be detailed in Section 3 and Section 4.

The most recent pull-down-based method, U-DNA-Seq, employs the U-DNA-specific binding of the FLAG-tagged ΔUNG sensor (already described above) to pull down uracil-containing genomic DNA fragments. As such, it is a more direct method and involves fewer complex steps than dU-seq and UDP-seq [91]. It is independent of the efficiency of the different enzymatic/chemical reactions used in dU-seq and UDP-seq; instead, it relies on the specificity and affinity of two interactions: that between ΔUNG and U-DNA, and that between the anti-FLAG antibody and the FLAG-tag. In the published paper, U-DNA-Seq was applied in the human cancer cell line HCT116 and in its mismatch repair proficient version, in which the UNG inhibitor UGI was stably expressed [91]. The effects of two thymidylate synthase inhibitory drugs, 5FdUR and RTX, on the genomic uracil content and its distribution were addressed. Using the authors' own analysis pipeline, remarkably high reproducibility among replicates (cf. Supplementary Materials in [91]) was presented, even when the results were compared to relevant samples from the published dU-seq data remapped and re-analyzed with the same pipeline.
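A coverage-based log2 ratio between pull-down and input samples is a common way to express enrichment of this kind. The Python sketch below shows a generic version of that calculation and is not the pipeline of [91]: scaling to counts per million and the pseudocount are assumptions of this sketch, and the function name and bin counts are hypothetical.

```python
import math


def log2_enrichment(pulldown_counts, input_counts, pseudocount=1.0):
    """Per-bin log2(pull-down / input) after library-size scaling.

    Both count vectors are scaled to counts per million so that samples
    sequenced to different depths become comparable. The pseudocount
    (added after scaling) avoids division by zero in empty input bins.
    """
    if len(pulldown_counts) != len(input_counts):
        raise ValueError("both samples must use the same genomic bins")
    pd_total = sum(pulldown_counts)
    in_total = sum(input_counts)
    if pd_total == 0 or in_total == 0:
        raise ValueError("a sample has no reads")
    ratios = []
    for pd_count, in_count in zip(pulldown_counts, input_counts):
        pd_cpm = pd_count / pd_total * 1e6
        in_cpm = in_count / in_total * 1e6
        ratios.append(math.log2((pd_cpm + pseudocount) / (in_cpm + pseudocount)))
    return ratios


if __name__ == "__main__":
    # Hypothetical read counts in four consecutive genomic bins.
    pulldown = [30, 10, 40, 20]
    input_sample = [25, 25, 25, 25]
    print(log2_enrichment(pulldown, input_sample))
```

Positive values mark bins enriched in the pull-down; consecutive positive bins could then be merged into broad regions.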

The experimental design, the applied biological models, and the data analysis, as well as the basic conclusions of the studies using the pull-down-based U-DNA mapping methods described above, are summarized in Table 1.

Table 1

| | dU-Seq | UDP-Seq | U-DNA-Seq |
| --- | --- | --- | --- |
| Genome | human | E. coli | human |
| Gene deficiency | ung−/−; wt | ung−/dut− vs. wt; ung−/mug− | hmlh1−/−; restored MMR |
| Transgene | -; ung; UDGs | -; A3A, A3A*; EV | ugi; - |
| Treatment | -; 5FdUR | -; - | 5FdUR, RTX; - |
| Data pre-processing | Trim, Align | Trim, Align, Filter | Trim, Align, Blacklist, Filter |
| Enrichment analysis | Peak calling | Peak calling, NDC, UI | Coverage, log2 ratio, broad regions |
| Conclusion | Centromeric enrichment | ung−/dut−: replication origin; A3A in ung−/mug−: lagging strand, hairpin loops, tRNA genes | Heterochromatin; upon treatments: shifted towards early replicating and active euchromatin |

It is interesting to consider whether a similar pull-down-coupled sequencing method could be developed based on UdgX. Sequencing over the crosslink might be challenging, although it may not be impossible to solve. Considering that UdgX has a strong preference for uracils in ssDNA [85], and that once it binds a uracil, other UDGs can no longer initiate its conversion to an AP site [92], such an approach could provide selective and safe enrichment of otherwise more vulnerable uracil-containing ssDNA fragments directly from cells. Moreover, high-throughput analysis of the crosslinked peptide–DNA fragments by mass spectrometry could also provide single-base resolution data wherever this is of interest.

All of the methods described above that are potentially capable of, or have already been used for, genome-wide mapping of uracil moieties have advantages and limitations. Excision-seq (pre- and post-digestion versions), dU-seq, snAP-seq, UDP-seq, Nick-seq, and U-DNA-Seq were used for genome-wide studies at different levels of genomic complexity and on different biological samples with markedly different levels and origins of uracil bases. Thus, evaluating the sequencing results and drawing conclusions from them require appropriate considerations.

CNB - Method for purifying uracil compounds

The invention relates to a purification process of a medical intermediate, in particular to a purification method of uracil compounds.

Uracil compounds are widely used in the preparation of pharmaceutical intermediates and pesticide products, and the uracil ring in such drug molecules is mostly formed either by direct introduction or by ring closure of small molecules. Because the uracil skeleton has many reactive sites, its synthesis is accompanied by numerous side reactions, and both the products and the byproducts are highly polar and poorly soluble in most solvents. The post-treatment of the reaction is therefore complex. Small batches can be purified by methods such as column chromatography; however, when uracil compounds are produced at the kilogram or ton scale, purifying crude products by column chromatography is too costly for industrial production. Because the impurities formed during preparation have properties close to those of the uracil compound itself, the common recrystallization process gives an insufficient degree of purification, and the product retains pigment molecules that seriously affect its quality. Therefore, a crude uracil compound cannot be purified to a high-purity solid by low-cost conventional recrystallization, which severely limits the commercial mass production of high-quality uracil compounds.

Disclosure of Invention

The technical problem to be solved by the invention is to provide a method for purifying the uracil compound, and the method is used for carrying out gradient recrystallization on the uracil compound and further pulping and purifying the solid obtained by recrystallization to obtain the uracil compound with higher purity.

Technical scheme

A method for purifying uracil compounds, comprising the steps of:

(1) thermally dissolving the crude product of the uracil compound in a polar aprotic solvent to obtain a crude product solution of the uracil compound;

(2) carrying out gradient cooling on the uracil compound crude product solution to separate out a solid;

(3) pulping and purifying the solid obtained in the step (2) by using acetone, further optimizing the particle size of the product, and fully dissolving pigment impurities in the acetone to obtain a pure uracil compound.

According to the technical scheme, the crude uracil compound is preferably dissolved at 80-120 °C, and insoluble substances are filtered off to obtain the crude uracil compound solution. The filtered solution still contains a considerable amount of pigment. Because pigment molecules are strongly colored, even a very small amount of pigment impurity carried into the recrystallized uracil compound turns the material grey, which lowers the product quality and reduces its application value. Conventional recrystallization cannot completely remove the pigment trapped in the solid particles, so the purification is incomplete. The technical scheme therefore adopts gradient crystallization to control the number and size of the crystals and obtain a fine powdery solid. A suitable pulping solvent and pulping time are then selected so that the pigment molecules in the solid particles dissolve fully in the pulping solvent, thereby achieving efficient purification of the product.

In the technical scheme, acetone is used as the pulping solvent because, compared with other solvents, it dissolves the pigment effectively and has a proper solubility for the product. "Proper solubility" here means that the solubility of the product in acetone is not so high that a large amount of product is lost by dissolution, and not so low that pulping purification becomes impractical. This moderate solubility allows the surfaces of the recrystallized particles to undergo dissolution and recrystallization during pulping within an operable time (for example several minutes to several hours, or even more than ten hours). The pigment inside the particles is released into the acetone during this dissolution and recrystallization, and the particles are thereby purified.

Further, the gradient cooling and solid precipitation in step (2) comprises: cooling to a first crystallization temperature to precipitate a solid, and then cooling to a second crystallization temperature to continue precipitating the solid; the first crystallization temperature is 20-60 °C, preferably 30-45 °C; the second crystallization temperature is -80 to 0 °C, preferably -40 to -10 °C.

Further, the crystallization time at the first crystallization temperature is 0.5-5 h, preferably 0.5-1.5 h; the crystallization time at the second crystallization temperature is 0.5-5 h, preferably 0.5-1.5 h.
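The temperature and time windows for the two cooling stages can be written down as a small check. The following is only an illustrative sketch: the names `CoolingStage`, `classify` and `check_schedule` are hypothetical, and the example schedule is an arbitrary choice inside the stated ranges, not a condition taken from the source.

```python
from dataclasses import dataclass

# Ranges as stated in the text: (allowed_min, allowed_max, preferred_min, preferred_max)
FIRST_TEMP_C = (20.0, 60.0, 30.0, 45.0)
SECOND_TEMP_C = (-80.0, 0.0, -40.0, -10.0)
HOLD_TIME_H = (0.5, 5.0, 0.5, 1.5)  # same window for both stages


def classify(value, limits):
    """Return 'preferred', 'allowed' or 'out of range' for one setting."""
    low, high, pref_low, pref_high = limits
    if pref_low <= value <= pref_high:
        return "preferred"
    if low <= value <= high:
        return "allowed"
    return "out of range"


@dataclass
class CoolingStage:
    temperature_c: float  # crystallization temperature of the stage
    hold_h: float         # crystallization time at that temperature


def check_schedule(first, second):
    """Classify every setting of a two-stage gradient cooling schedule."""
    return {
        "first_temperature": classify(first.temperature_c, FIRST_TEMP_C),
        "first_hold": classify(first.hold_h, HOLD_TIME_H),
        "second_temperature": classify(second.temperature_c, SECOND_TEMP_C),
        "second_hold": classify(second.hold_h, HOLD_TIME_H),
    }


# Hypothetical schedule: 40 °C for 1 h, then -20 °C for 3 h
report = check_schedule(CoolingStage(40.0, 1.0), CoolingStage(-20.0, 3.0))
print(report)
```

In this example the 3 h hold at the second temperature lies inside the 0.5-5 h window but outside the preferred 0.5-1.5 h, so it is reported as "allowed" rather than "preferred".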

Further, the polar aprotic solvent of step (1) is selected from DMF, DMA, NMP, DMSO, preferably DMF.

Further, the concentration of the crude product in the polar aprotic solvent in the step (1) is 0.2-5 mol/L, preferably 0.8-2 mol/L.
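As a worked example of the concentration range, the solvent volume for a given batch follows from moles of crude divided by the target concentration. The sketch below makes several assumptions that are not in the source: the crude is treated as pure compound, the concentration is taken as mol per litre of solvent, and the batch size and the molar mass of 160.56 g/mol (calculated for C5H5ClN2O2, a chloromethyl uracil) are illustrative values. The function name `solvent_volume_l` is hypothetical.

```python
def solvent_volume_l(crude_mass_g, molar_mass_g_per_mol, conc_mol_per_l):
    """Litres of solvent so that (moles of crude) / (litres of solvent) equals the target."""
    if crude_mass_g <= 0 or molar_mass_g_per_mol <= 0 or conc_mol_per_l <= 0:
        raise ValueError("all inputs must be positive")
    moles = crude_mass_g / molar_mass_g_per_mol
    return moles / conc_mol_per_l


# Hypothetical batch: 1 kg of crude, treated as pure compound (assumption)
CRUDE_MASS_G = 1000.0
MOLAR_MASS_G_PER_MOL = 160.56  # assumed value for C5H5ClN2O2

# Ends of the preferred range, 0.8-2 mol/L
most_dilute_l = solvent_volume_l(CRUDE_MASS_G, MOLAR_MASS_G_PER_MOL, 0.8)
most_concentrated_l = solvent_volume_l(CRUDE_MASS_G, MOLAR_MASS_G_PER_MOL, 2.0)
print(f"Solvent volume: {most_concentrated_l:.2f} to {most_dilute_l:.2f} L")
```

Under these assumptions, a 1 kg batch in the preferred range needs roughly 3.1 to 7.8 L of solvent.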

Further, the pulping time in acetone in step (3) is 1-20 hours, preferably 1.5-4 hours.
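Adding up the stated hold times gives a rough idea of how long the timed steps take. This sketch sums only the two crystallization holds and the acetone pulping; heating, dissolution, cooling ramps, filtration and drying are not quantified in the text and are left out. The names `HOLD_TIMES_H` and `total_hold_time` are hypothetical.

```python
# Hold times in hours, as stated in the text: full range and preferred range
HOLD_TIMES_H = {
    "first crystallization": {"full": (0.5, 5.0), "preferred": (0.5, 1.5)},
    "second crystallization": {"full": (0.5, 5.0), "preferred": (0.5, 1.5)},
    "acetone pulping": {"full": (1.0, 20.0), "preferred": (1.5, 4.0)},
}


def total_hold_time(kind):
    """Sum the shortest and longest hold times over all steps for 'full' or 'preferred'."""
    shortest = sum(step[kind][0] for step in HOLD_TIMES_H.values())
    longest = sum(step[kind][1] for step in HOLD_TIMES_H.values())
    return shortest, longest


for kind in ("preferred", "full"):
    shortest, longest = total_hold_time(kind)
    print(f"{kind}: {shortest} to {longest} h")
```

With the preferred settings the timed steps add up to 2.5 to 7 h; over the full ranges they span 2 to 30 h.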

Further, seed crystals are added at the first crystallization temperature to induce crystallization and precipitate the solid. According to the technical scheme, controlling the amount of seed crystal added, as a percentage of the product content of the solution, controls crystal growth and crystal size more effectively and facilitates the subsequent pulping treatment.

Further, the crude product used in step (1) is obtained by a preliminary purification of the reaction solution, comprising: mixing the reaction solution containing the uracil compound with water or water-dichloromethane, stirring, filtering to obtain a solid, and drying to obtain the crude product of the uracil compound.

Further, the uracil compound has a general formula shown in formula I:

formula I

wherein R1 and R2 are independently selected from hydrogen, hydroxyl, cyano, and substituted or unsubstituted (C1-6) alkyl; and R3 and R4 are independently selected from hydrogen, halogen, hydroxyl, cyano, and substituted or unsubstituted (C1-6) alkyl.

Further, the general formula of the uracil compound is shown as formula II:

formula II

wherein R1, R2 and R3 are all hydrogen atoms, and R4 is selected from (C1-6) haloalkyl, preferably monochloromethyl.

Further, when R1, R2 and R3 are all hydrogen atoms and R4 is monochloromethyl, the compound is prepared by reacting the compound of formula III with n-butyllithium and then quenching. The structure of formula III is as follows:

formula III

wherein R1 and R2 are both hydrogen atoms; R3 is selected from halogen, preferably an iodine atom; and R4 is monochloromethyl.

Further, when R1, R2 and R3 are all hydrogen atoms and R4 is monochloromethyl, the compound is prepared by chlorination of the compound of formula IV. Further, the chlorination is carried out with thionyl chloride in DMF. The structure of formula IV is:

formula IV

wherein R1, R2 and R3 are all hydrogen atoms and R4 is hydroxymethyl.

Further, when R1, R2 and R3 are all hydrogen atoms and R4 is monochloromethyl, the compound is prepared by hydrolysis of the compound of formula V. The structure of formula V is:

formula V

wherein R7 is selected from substituted or unsubstituted (C1-6) alkyl; R8 and R10 are hydrogen atoms; and R9 is chloromethyl.

Advantageous effects

The invention provides a method for purifying uracil compounds that purifies the crude uracil compound by optimizing the recrystallization conditions and combining them with an improved pulping purification process. The process is readily scaled up to kilogram level or beyond, improves production efficiency and reduces purification cost. In particular, the invention starts from a solid crude product that has undergone preliminary purification, which effectively reduces the impurity content, improves the form of the product and facilitates recrystallization. A gradient recrystallization process is then used to purify the product further, and the particle size of the solid is optimized by controlling the recrystallization conditions. As a result, during the subsequent pulping purification the solid particles undergo re-dissolution and precipitation within an operable time, and the pigment molecules trapped inside the particles dissolve in the acetone. This achieves full purification of the product and removes pigment impurities that are enclosed in the particles and are difficult to remove by ordinary recrystallization or pulping.

Compared with other solvents, acetone as the pulping solvent dissolves pigment molecules well and has a suitably moderate solubility for the product, so that the surface of the solid particles dissolves and recrystallizes within an operable time during pulping, the pigment in the particles dissolves completely in the acetone, and a uracil compound of high purity is obtained.

The purity of the uracil compound purified according to the technical scheme can reach 99.9%. The purification method therefore overcomes the low efficiency and high cost of column chromatography in the prior art, as well as the inability of prior-art pulping or recrystallization to completely remove impurity molecules from the uracil compound, and markedly improves both the efficiency and the effect of purification. Compared with other purification processes, especially column chromatography, the process is better suited to kilogram-scale or ton-scale production and reduces production cost.
